hsx group repos are open - perm removal integration is complete
Coleen Phillimore
coleen.phillimore at oracle.com
Mon Sep 17 14:15:24 PDT 2012
On 9/17/2012 4:54 PM, Kelly O'Hair wrote:
> So I need some feedback from the hotspot team as to what the behavior should be now.
>
> Assuming we are continuing with the multiple jobs feature of JPRT... Everybody ok with that???
Love it!
>
> There is a job queue for each system (all the jobs that were submitted and have not finished).
> There is a running queue for each system (all jobs actively building or testing).
>
> The old "rule" has been that any job pushing to the same repository as another job, on any queue, should block (not progress) until
> the other job completes. This is currently enforced with a lock (a directory called "lockdir") in the
> NFS area /net/prt-archiver/.../locks. (If anyone has a better lockdir mechanism, please let me know.)
It has to be global somewhere.
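
For reference, a directory-based lock like the "lockdir" described above can be sketched roughly as follows. This is only an illustration, not JPRT's actual code: the function names, the polling interval, and the timeout are all assumptions, and it relies on directory creation being atomic on the shared file system, which should be checked for the NFS setup in question.

```python
import os
import time


def try_acquire(lock_dir):
    """Try once to take the lock; return True on success (hypothetical helper)."""
    try:
        os.mkdir(lock_dir)  # raises FileExistsError if another job holds the lock
        return True
    except FileExistsError:
        return False


def acquire_blocking(lock_dir, poll_seconds=1.0, timeout=None):
    """Poll until the lock is free; return False if the timeout expires first."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while not try_acquire(lock_dir):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True


def release(lock_dir):
    """Give up the lock by removing the directory."""
    os.rmdir(lock_dir)
```

A single call to try_acquire corresponds to rejecting a job when the lock is taken, while acquire_blocking corresponds to letting the job wait. With polling like this, which waiter gets the lock next is not ordered in any way.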
>
> I started rejecting jobs that would block a while back because it was severely reducing throughput
> for everyone else. Now that JPRT can run multiple jobs, I have alternatives.
I didn't notice this change. Maybe it's usually not as frequent as it has
been today.
>
> I can let the job take up a slot in the running job list, and just let it block; that preserves its
> priority when it does start running for that queue, but I have no easy way to prioritize it against
> all other jobs in all the queues, e.g. all that are blocked on the same repo.
> If that is acceptable, please let me know.
Are you saying that if one job is running and there is more than one
job blocked for the same repository, it's really just luck which one gets
the lock next? I think that's fine. The queue length for any given
repository won't be that long so it won't be starved, and watching the
queue to resubmit is a real pain.
If there is a non-trivial merge, you kick out the job, right?
>
> Currently rejecting it at submit is bad, I agree, I will try and look into fixing that.
Thank you!
Coleen
> -kto
>
> On Sep 17, 2012, at 1:07 PM, Coleen Phillimore wrote:
>
>> Bengt you got a "Could not acquire lock for parent" because I am pushing to hotspot-gc and I think the behavior now is to lock out the second job rather than just integrate. This should be fixed.
>> Also my job has been stuck in the JPRT east queue for 3 hours. If I kill it and you submit, I'll have to wait for your job again to even start mine.
>> Tim's looking at this now.
>>
>> Coleen
>>
>> On 9/17/2012 4:00 PM, Bengt Rutisson wrote:
>>> On 2012-09-17 21:42, John Coomes wrote:
>>>> Vladimir Kozlov (vladimir.kozlov at oracle.com) wrote:
>>>>> John, did you unlock repos?
>>>> Yes. The config on the servers looks ok (no restrictions), and Bengt
>>>> was able to push to hotspot-gc this morning.
>>> Yes, I was able to push once this morning. But now I also get "Could not acquire lock for parent" when I try to push:
>>>
>>> http://prt-web.us.oracle.com//archive/2012/09/2012-09-17-191527.brutisso.hs-gc-g1-gc-timestamp/syslogs/2012-09-17-191527.brutisso.hs-gc-g1-gc-timestamp.log
>>>
>>> Bengt
>>>
>>>>>> At this point, all the hsx repos will be identical, and *locked*
>>>>>> except for perm removal changes.
>>>>> I see that a few JPRT jobs are filed with:
>>>>>
>>>>> "Fail/kill comment: Could not acquire lock for parent:"
>>>> This looks like something strange with jprt. One of Zhengyu's jobs
>>>> (2012-09-14-170744.zhgu.hotspot) is marked as failed, but it actually
>>>> pushed the changeset to hotspot-gc, and I don't see any error in the
>>>> jprt logs.
>>>>
>>>> -John
>>>>
>>>>> John Coomes wrote:
>>>>>> The perm removal changes have been pushed up to hotspot-main and
>>>>>> pulled down into all the hotspot-{group} repos, so the group repos are
>>>>>> now open for normal work. Thanks for your help and patience.
>>>>>>
>>>>>> Please avoid pushing to hsx/hotspot-main for a few more days while we
>>>>>> run PIT, in case there are some fixes required to allow integration
>>>>>> into jdk8.
>>>>>>
>>>>>> -John