RFR: 8341436: containers/docker/TestJcmdWithSideCar.java takes needlessly long to run [v3]
duke
duke at openjdk.org
Mon Oct 7 19:56:35 UTC 2024
On Mon, 7 Oct 2024 19:37:51 GMT, Sebastian Lövdahl <duke at openjdk.org> wrote:
>> The fix is twofold.
>>
>> 1. Stop the main container after an iteration is done. The main container is started with its runtime defined as 120 seconds, which means that each iteration takes 120 seconds. In reality, one iteration takes a few seconds while 115 seconds is spent waiting on the main container exiting.
>>
>> 2. Change the name of the main container to be unique per iteration. Containers are started with `--rm`, which means they are removed after exiting. However, the removal is done asynchronously _after_ the `stop` command has returned. This means that the second iteration may get an error if the same container name is used if the removal was not done before the container is started in the next iteration.
>>
>> On my machine, this cuts down the test runtime using Podman from 4m 13s to 17s. Using Docker, the runtime goes from 4m 15s to 41s.
>>
>> Podman only runs half the test cases (since JDK-8341310) which explain some of the difference. But there is also something strange going on in the Docker case; every `docker stop` call takes 10 seconds, and I have not been able to figure out what exactly causes it.
>>
>> Doing a manual `kill [container Java process PID]` gracefully terminates the Java process and container, but `docker stop` never does. Instead, it blocks for 10 seconds before abruptly killing the process using `SIGKILL`. I confirmed this with a simplified case and both
>> `strace -e 'trace=!all' docker run --init eclipse-temurin:23 java ..` and `strace -e 'trace=!all' docker run eclipse-temurin:23 java ..`, no signals were ever visible when calling either `docker stop` or `docker kill`.
>>
>> https://www.docker.com/blog/docker-best-practices-choosing-between-run-cmd-and-entrypoint/ and "What is PID 1 and why does it matter?" talks about why [`--init`](https://docs.docker.com/reference/cli/docker/container/run/#init) is supposed to help.
>
> Sebastian Lövdahl has updated the pull request incrementally with one additional commit since the last revision:
>
> Have EventGeneratorLoop end after a more predictable duration
@slovdahl
Your change (at version cdf92697b4e9bf6c47138db38e9c2cd22ddf558f) is now ready to be sponsored by a Committer.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/21331#issuecomment-2397766398
More information about the serviceability-dev
mailing list