python - random.sample() generating same sequence every time it is run - Stack Overflow


Updated: 2025-03-14


I run this code on two machines:

import random

from apscheduler.schedulers.asyncio import AsyncIOScheduler

# this part is simplified. It is only here to show how scheduler is basically initialized (for context)
scheduler = AsyncIOScheduler(timezone=utc)
scheduler.start()

# This is real code (with exception of the list)
@scheduler.scheduled_job('interval', minutes=1, misfire_grace_time=None)
async def do_dada_news():
    pages = [...] # shortened for better readability. It is longer than 20 elements
    print("---")
    for page in random.sample(pages, min(len(pages), 20)):
        print(page)

The two machines produce different output, which is strange:

Local docker container: I get 20 different lines every time do_dada_news() runs.
Kubernetes cluster: I get the exact same 20 lines every time it is run.

I expect both machines to behave the same way. What could cause such a difference?

To temporarily fix the problem, I now do random.seed(time.time()*10000) inside do_dada_news(). But that does not feel right.


Asked Jan 30 at 22:04 by FEZ; edited Feb 1 at 20:27 by Péter Szilvási

Seeding the RNG from the time is the normal way to get a different random sequence on each run. – Barmar Commented Jan 30 at 22:11
But why do consecutive calls to random.sample() create different results on one system and always the same on the other? Consecutive calls to any random function usually create different results without seeding in between – FEZ Commented Jan 30 at 22:20
Of course you get different results on consecutive calls, it wouldn't be random if you didn't. Seeding just sets the starting point. As for why you get different results on each system, it could be a difference between Docker and Kubernetes. – Barmar Commented Jan 30 at 23:23


1 Answer


If no seed is provided for Python's built-in random module, it uses os.urandom() to set the seed. Crucially, if the operating system has a built-in source of randomness (Linux and Windows both do), Python defaults to that source instead of just using the system time.
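As a rough illustration of the seeding behavior described above, the sketch below reseeds the module-level generator from the operating system and also draws a sample through random.SystemRandom, which reads from os.urandom() on every call instead of relying on a stored seed. The pages list is a hypothetical stand-in for the real list in the question.

```python
import random

# Hypothetical stand-in for the real list of pages from the question.
pages = [f"page-{i}" for i in range(50)]

# seed() with no argument reseeds the module-level generator from OS
# randomness when available, falling back to the current time otherwise.
random.seed()
first = random.sample(pages, min(len(pages), 20))

random.seed()
second = random.sample(pages, min(len(pages), 20))

# SystemRandom does not keep a seed at all: each call pulls fresh bytes
# from os.urandom(), so it cannot be "stuck" on a seed set elsewhere.
sys_rng = random.SystemRandom()
third = sys_rng.sample(pages, min(len(pages), 20))

print(first != second)  # almost certainly True
```

If the samples still repeat with SystemRandom, the cause is probably not the seed but something else in the job, which would be worth checking before changing any system configuration.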

While you could change the Linux configuration settings, it would be much easier to seed the generator explicitly with random.seed(int(time.time())**20%999979).
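To make the effect of seeding concrete, here is a minimal sketch that uses a private random.Random instance rather than the module-level generator. It shows that a fixed seed reproduces the same 20 items (the symptom seen on the cluster) while a time-based seed does not. The helper sample_with_seed, the pages list, and the use of time.time_ns() are assumptions for illustration; time.time_ns() is swapped in for the expression above because int(time.time()) only changes once per second and the modulo limits the seed to fewer than a million values.

```python
import random
import time

# Hypothetical stand-in for the real list of pages from the question.
pages = [f"page-{i}" for i in range(50)]


def sample_with_seed(seed):
    """Return up to 20 pages chosen by a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, global state untouched
    return rng.sample(pages, min(len(pages), 20))


# Same seed, same "random" sample: this is what repeated output looks like.
same_a = sample_with_seed(1234)
same_b = sample_with_seed(1234)
print(same_a == same_b)  # True

# A nanosecond timestamp differs between runs of a job scheduled minutes apart.
fresh = sample_with_seed(time.time_ns())
print(len(fresh))  # 20
```

Keeping the generator in a local random.Random instance also means that other code calling random.seed() cannot affect the scheduled job.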

Linux in particular uses an entropy pool as its source of randomness, and there is a suggestion (linked in the original answer) that the issue might be mitigated by upgrading to Linux 5.6. In general, though, the entropy pool needs a short delay to gather the randomness it requires.

If I were very concerned about not having this issue in future, I would set up a queue and create a function that, when called, returns the top number from the queue, dequeues it, and then adds a new random number to the bottom of the queue based on the mod-product of the numbers still in it. That way you should at least be guaranteed a source of randomness that you control.
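One possible reading of the queue idea is sketched below. The class name QueueRandom, the queue size, the initial parameter, and the reuse of 999979 as the modulus are all assumptions made for illustration; the answer does not specify them. This is a toy generator whose statistical quality is untested, so it is a sketch of the mechanism rather than a replacement for the random module.

```python
import random
from collections import deque


class QueueRandom:
    """Toy generator: pop the front value, push the mod-product of the rest."""

    def __init__(self, size=8, modulus=999979, initial=None):
        self.modulus = modulus
        if initial is None:
            # Fill the queue once from the OS source of randomness.
            sys_rng = random.SystemRandom()
            initial = [sys_rng.randrange(1, modulus) for _ in range(size)]
        self.queue = deque(initial)

    def next(self):
        # Return the top number and remove it from the queue.
        value = self.queue.popleft()
        # New entry: product of the remaining numbers, reduced by the modulus.
        product = 1
        for n in self.queue:
            product = (product * n) % self.modulus
        # Guard against zero, which would zero out every later product.
        self.queue.append(product or 1)
        return value


gen = QueueRandom()
numbers = [gen.next() for _ in range(5)]
print(numbers)
```

The values could then be used as seeds or indices, for example `random.Random(gen.next()).sample(pages, 20)` with the pages list from the question.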
