Random values from NVML functions
Ty Young
youngty1997 at gmail.com
Thu May 14 00:21:59 UTC 2020
On 5/13/20 7:01 PM, Maurizio Cimadamore wrote:
>
> Btw - nice-looking app! (I looked at the pic :-) )
>
Thanks!
> If I understand correctly, the place where you are getting garbage
> values out of is this:
>
> https://github.com/BlueGoliath/java-nvidia-bindings/blob/master/modules/org.goliath.bindings.nvml/src/main/java/org/goliath/bindings/nvml/main/nvml_h.java#L426
>
> More specifically, after the call, the array of
> nvmlProcessUtilizationSample_t doesn't contain what you think it
> should contain. Am I correct?
>
Right, it's almost as if the memory isn't being sliced correctly. That
said, I'm not sure how incorrectly sliced memory, if zeroed, would give
those numbers to begin with.
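
To make the "sliced incorrectly" idea concrete, here is a minimal sketch
in plain Java (java.nio.ByteBuffer rather than FMA, so it runs on any
JDK). Everything in it is an assumption for illustration: the class
name, the field offsets, the 32-byte struct size and the fake fill
routine are hypothetical and are not taken from nvml.h or from the
bindings linked in this thread. The stride of 48 is used only because
that is the size mentioned in this thread; the sketch does not establish
that it is the cause. It only shows that when the stride used for
slicing differs from the stride the writer used, every element after the
first reads fields belonging to its neighbours.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class StructSliceSketch {
    // Hypothetical layout: one 32-bit field, 4 bytes of padding, one
    // 64-bit field, then four 32-bit fields (32 bytes in total).
    static final int STRUCT_SIZE = 32;
    static final int PID = 0, TIMESTAMP = 8, SM_UTIL = 16,
                     MEM_UTIL = 20, ENC_UTIL = 24, DEC_UTIL = 28;

    // Returns a view over element `index`, assuming elements are
    // `stride` bytes apart in `array`.
    static ByteBuffer element(ByteBuffer array, int index, int stride) {
        ByteBuffer view = array.duplicate();
        view.position(index * stride);
        view.limit(index * stride + stride);
        // slice() resets the byte order, so restore it explicitly.
        return view.slice().order(array.order());
    }

    // Stand-in for a native callee that fills `count` packed structs.
    static ByteBuffer fakeNativeFill(int count) {
        ByteBuffer buf = ByteBuffer.allocate(count * STRUCT_SIZE)
                                   .order(ByteOrder.LITTLE_ENDIAN);
        for (int i = 0; i < count; i++) {
            int base = i * STRUCT_SIZE;
            buf.putInt(base + PID, 1000 + i);
            buf.putLong(base + TIMESTAMP, 1_589_415_719_000_000L + i);
            buf.putInt(base + SM_UTIL, 90 + i);
            buf.putInt(base + MEM_UTIL, 40 + i);
            buf.putInt(base + ENC_UTIL, 0);
            buf.putInt(base + DEC_UTIL, 0);
        }
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = fakeNativeFill(3);

        // Matching stride: element 1 reads its own fields.
        ByteBuffer ok = element(buf, 1, STRUCT_SIZE);
        System.out.println("pid=" + ok.getInt(PID)
                + " memUtil=" + ok.getInt(MEM_UTIL));

        // Mismatched stride: element 1 starts in the middle of struct 1,
        // so "pid" is really struct 1's smUtil and "smUtil" is really
        // struct 2's pid.
        ByteBuffer shifted = element(buf, 1, 48);
        System.out.println("pid=" + shifted.getInt(PID)
                + " smUtil=" + shifted.getInt(SM_UTIL));
    }
}
```

With a mismatched stride the values are not random, just plausible
looking numbers read from the wrong fields, which is one way zeroed
memory could still end up showing odd utilization values.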
> Can I see the client code which calls this function, so that I can
> take a look at all the pieces?
>
Of course:
https://github.com/BlueGoliath/GoliathEnviousNative/blob/master/modules/org.goliath.envious.nvml/src/main/java/org/goliath/envious/nvml/local/attributes/NVMLGPUProcessAttributeData.java
Be warned though, the code isn't as pretty as the GUI.
> Thanks
> Maurizio
>
> On 14/05/2020 00:55, Ty Young wrote:
>>
>> On 5/13/20 6:38 PM, Maurizio Cimadamore wrote:
>>> Hi,
>>> is this a regression? E.g. did this work before and then start
>>> behaving differently all of a sudden (e.g. after a rebuild on
>>> panama), or is this a new function you are trying to call and you
>>> are getting odd behavior?
>>
>>
>> Not sure.
>>
>>
>> After converting everything from the Pointer API to FMA, it started
>> giving me 0 for everything, where the Pointer API would give me
>> seemingly correct non-zero values the majority of the time but would
>> sometimes give random garbage. Because the old Pointer API never
>> zeroed memory, I have no idea if those values were valid or not, so I
>> didn't think much of always getting 0.
>>
>>
>> Yesterday I did some cleanups in the OO code (the layer under
>> JavaFX), including converting NativeValue<Integer> instances to
>> NativeInteger (same for longlong), and it started doing this, which I
>> think is partially correct: if I start a GPU benchmarking application
>> (Unigine Superposition) and view the processes content in the GUI, I
>> do see seemingly correct utilization rates that match the in-app
>> On-Screen-Display FPS.
>>
>>
>> The issue is with Memory Utilization and Video encoder/decoder
>> Utilization.
>>
>>
>>>
>>> Maurizio
>>>
>>> On 14/05/2020 00:00, Ty Young wrote:
>>>> Hi,
>>>>
>>>>
>>>> Currently I'm getting random values[1] from this NVML function[2].
>>>> I've spent a few hours dumping sizes and re-checking my abstraction
>>>> layer code in order to figure out why it's doing this, but I'm not
>>>> seeing anything. I'm wondering if there were any recent bugs in FMA
>>>> that might cause this and have since been fixed. If not, I'm going
>>>> to have to try asking on the Nvidia forums.
>>>>
>>>>
>>>> For reference, the function binding can be found here:
>>>>
>>>>
>>>> https://github.com/BlueGoliath/java-nvidia-bindings/blob/master/modules/org.goliath.bindings.nvml/src/main/java/org/goliath/bindings/nvml/main/nvml_h.java#L426
>>>>
>>>>
>>>>
>>>> and the abstraction layer here:
>>>>
>>>>
>>>> https://github.com/BlueGoliath/Crosspoint/tree/master/src/main/java/org/goliath/crosspoint
>>>>
>>>>
>>>>
>>>> I'm able to read/write other structs just fine, such as:
>>>>
>>>>
>>>> https://github.com/BlueGoliath/java-nvidia-bindings/blob/master/modules/org.goliath.bindings.nvctrl/src/main/java/org/goliath/bindings/nvctrl/structs/NVCTRLAttributeValidValuesRec.java
>>>>
>>>>
>>>>
>>>> and again, all byte sizes seem correct (48 bytes for the NVML
>>>> struct), so I'm really lost here.
>>>>
>>>>
>>>>
>>>> [1] https://imgur.com/a/wrQtOXq
>>>>
>>>> [2]
>>>> https://docs.nvidia.com/deploy/nvml-api/group__nvmlGridQueries.html#group__nvmlGridQueries_1gb0ea5236f5e69e63bf53684a11c233bd
>>>>
More information about the panama-dev
mailing list