[BUG] Hybrid Experiment Creation fails on large enough Query Sets #158

### What is the bug?
Creating a Hybrid Experiment fails when the Query Set and Judgments are large enough. The error message stored with the experiment is not informative.

The error appears to be triggered by the parallel processing done at the ExperimentVariant level: searches are performed in parallel, and the results are populated in parallel as well. If the cluster cannot keep up with the requests, it starts rejecting new ones.
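To illustrate the suspected mechanism and one possible mitigation, here is a minimal sketch that caps the number of in-flight requests with a semaphore. It is not the plugin's code: `fake_search`, `run_bounded`, and `MAX_IN_FLIGHT` are hypothetical names, and the search is simulated rather than sent to a cluster.

```python
import asyncio

MAX_IN_FLIGHT = 8  # hypothetical cap, chosen well below the search queue capacity


async def fake_search(query_id: int, stats: dict) -> int:
    """Stand-in for one search request; tracks how many run at once."""
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0)  # yield control, as a real network call would
    stats["active"] -= 1
    return query_id


async def run_bounded(num_requests: int, limit: int = MAX_IN_FLIGHT) -> dict:
    """Start all requests, but let at most `limit` of them run at a time."""
    semaphore = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0, "done": 0}

    async def guarded(query_id: int) -> None:
        async with semaphore:  # wait here instead of flooding the cluster
            await fake_search(query_id, stats)
            stats["done"] += 1

    await asyncio.gather(*(guarded(i) for i in range(num_requests)))
    return stats


result = asyncio.run(run_bounded(150))
print(result)
```

With a cap like this, the number of requests waiting on the cluster never exceeds the limit, regardless of how many queries or variants the experiment contains.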

### How can one reproduce the bug?
On a local cluster, first run the script `demo_hybrid_optimizer.sh`, which creates the query sets and judgments needed to trigger this bug. Then, using the UI, create a Hybrid Experiment that uses the ESCI Query Set (150 queries) and the ESCI judgments.

When it finishes, the experiment displays status `ERROR`. The error message in the experiment document is empty.

In the backend logs one can find:
```
opensearch_search_relevance  | [2025-06-30T09:29:15,686][ERROR][o.o.s.t.e.PutExperimentTransportAction] [opensearch] Failed to process metrics for experiment: 5921f9bc-6ad7-4af6-a0a4-d50e5c46e297
opensearch_search_relevance  | org.opensearch.action.search.SearchPhaseExecutionException: all shards failed
opensearch_search_relevance  |  at org.opensearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:775) ~[opensearch-3.1.0.jar:3.1.0]
opensearch_search_relevance  |  at org.opensearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:395) ~[opensearch-3.1.0.jar:3.1.0]
opensearch_search_relevance  |  at org.opensearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:815) ~[opensearch-3.1.0.jar:3.1.0]
```

The reason for this failure seems to be:
```
opensearch_search_relevance  | Caused by: org.opensearch.OpenSearchException$3: rejected execution of org.opensearch.common.util.concurrent.TimedRunnable@cd7dc6d on QueueResizableOpenSearchThreadPoolExecutor[name = opensearch/search, queue capacity = 1000, org.opensearch.common.util.concurrent.QueueResizableOpenSearchThreadPoolExecutor@14b4f462[Running, pool size = 13, active threads = 13, queued tasks = 1000, completed tasks = 296]]
opensearch_search_relevance  |  at org.opensearch.OpenSearchException.guessRootCauses(OpenSearchException.java:716) ~[opensearch-core-3.1.0.jar:3.1.0]
opensearch_search_relevance  |  at org.opensearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:393) ~[opensearch-3.1.0.jar:3.1.0]
opensearch_search_relevance  |  ... 80 more
opensearch_search_relevance  | Caused by: org.opensearch.core.concurrency.OpenSearchRejectedExecutionException: rejected execution of org.opensearch.common.util.concurrent.TimedRunnable@cd7dc6d on QueueResizableOpenSearchThreadPoolExecutor[name = opensearch/search, queue capacity = 1000, org.opensearch.common.util.concurrent.QueueResizableOpenSearchThreadPoolExecutor@14b4f462[Running, pool size = 13, active threads = 13, queued tasks = 1000, completed tasks = 296]]
opensearch_search_relevance  |  at org.opensearch.common.util.concurrent.OpenSearchAbortPolicy.rejectedExecution(OpenSearchAbortPolicy.java:67) ~[opensearch-3.1.0.jar:3.1.0]
opensearch_search_relevance  |  at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:841) ~[?:?]
opensearch_search_relevance  |  at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1376) ~[?:?]
opensearch_search_relevance  |  at org.opensearch.common.util.concurrent.OpenSearchThreadPoolExecutor.execute(OpenSearchThreadPoolExecutor.java:131) ~[opensearch-3.1.0.jar:3.1.0]
opensearch_search_relevance  |  ... 61 more
opensearch_search_relevance  | [2025-06-30T09:29:15,782][INFO ][o.o.s.t.e.PutExperimentTransportAction] [opensearch] Updated experiment 5921f9bc-6ad7-4af6-a0a4-d50e5c46e297 status to ERROR
```
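The rejection message already contains the executor state, so saturation can be confirmed directly from the log. The following is a small sketch that extracts those numbers from the message quoted above; `parse_pool_state` and `is_saturated` are hypothetical helper names, not part of OpenSearch.

```python
import re

# Executor description copied from the rejection message in the log above.
REJECTION = (
    "QueueResizableOpenSearchThreadPoolExecutor[name = opensearch/search, "
    "queue capacity = 1000, "
    "org.opensearch.common.util.concurrent.QueueResizableOpenSearchThreadPoolExecutor@14b4f462"
    "[Running, pool size = 13, active threads = 13, queued tasks = 1000, completed tasks = 296]]"
)

FIELDS = ("queue capacity", "pool size", "active threads", "queued tasks", "completed tasks")


def parse_pool_state(message: str) -> dict:
    """Extract the numeric executor fields from a rejection message."""
    state = {}
    for field in FIELDS:
        match = re.search(rf"{field} = (\d+)", message)
        if match:
            state[field] = int(match.group(1))
    return state


def is_saturated(state: dict) -> bool:
    """True when every thread is busy and the queue is full."""
    return (
        state["active threads"] >= state["pool size"]
        and state["queued tasks"] >= state["queue capacity"]
    )


state = parse_pool_state(REJECTION)
print(state, is_saturated(state))
```

Here all 13 threads are active and the queue holds 1000 tasks against a capacity of 1000, which is consistent with the search thread pool being overwhelmed rather than with a problem in the queries themselves.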

### What is the expected behavior?
Experiment creation should either succeed or be declined up front if there are too many queries (although 150 queries is not a lot).
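In either case, it would help if the error stored with the experiment named the underlying cause instead of being empty. As a rough sketch of that idea, the snippet below walks an exception's cause chain and returns the innermost message. The exception classes and `root_cause_message` are hypothetical stand-ins for illustration, not the plugin's actual types.

```python
class RejectedExecution(Exception):
    """Hypothetical stand-in for a thread pool rejection."""


class SearchPhaseFailure(Exception):
    """Hypothetical stand-in for the outer 'all shards failed' error."""


def root_cause_message(error: BaseException) -> str:
    """Follow the __cause__ chain and describe the innermost exception."""
    seen = set()
    current = error
    while current.__cause__ is not None and id(current.__cause__) not in seen:
        seen.add(id(current))  # guard against accidental cycles
        current = current.__cause__
    return f"{type(current).__name__}: {current}"


try:
    try:
        raise RejectedExecution("search queue full (queued tasks = 1000)")
    except RejectedExecution as rejected:
        raise SearchPhaseFailure("all shards failed") from rejected
except SearchPhaseFailure as failure:
    message = root_cause_message(failure)

print(message)
```

Storing a message like this would let a user see that the cluster rejected requests without having to read the backend logs.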

### What is your host/environment?
n/a

### Do you have any screenshots?
n/a

### Do you have any additional context?
n/a

