Proposal: UPSTREAM.md -- better tracking of upstream code in the JDK
Kevin Rushforth
kevin.rushforth at oracle.com
Thu Apr 21 19:35:50 UTC 2022
I like the idea as long as we avoid duplication. Each third-party
library already has a required xxxx.md file in src/<module>/share/legal.
Likewise, some modules have "UPDATING" instructions in the component
itself (which, I think, is where they belong). If each entry in this
aggregate UPSTREAM.md file were limited to the name of the component
(not its version), the location of the md file, the location of the
UPDATING instructions (if any), and the location of the source code
(preferably as a dir or list of dirs), that seems workable.
-- Kevin
On 4/21/2022 12:13 PM, daniel.daugherty at oracle.com wrote:
>> Thoughts?
>
> I like this idea. It will also benefit code archaeologists and
> spelunkers.
>
> Dan
>
>
> On 4/21/22 2:58 PM, Magnus Ihse Bursie wrote:
>> The JDK project depends on many different open source projects. Some
>> of them are linked to as libraries at runtime, but others have their
>> source code directly incorporated into our source tree, known as "3rd
>> party code".
>>
>> Unfortunately, the haphazard way this code is sprinkled throughout
>> our code base makes it very hard to tell at a glance if some code
>> originated with the JDK project, or is imported from elsewhere
>> ("upstream"). Many times, you need to be well acquainted with these
>> parts of the code to know whether a file is 3rd party code or not. If
>> you do not know, you will need to rely on heuristics such as looking
>> at the path name, checking for unusual copyright headers, or looking
>> at the git history for commits that indicate a refresh from upstream.
>>
>> I propose we do something about this situation.
>>
>> My suggestion is that we add a file, UPSTREAM.md, in the top
>> directory of the imported 3rd party code. These files will follow a
>> pattern, with a set of formalized headers on the top, a blank line of
>> separation, and then a free-form markdown text, with e.g. relevant
>> notes about the project, important information about the latest
>> update, or instructions or hints on how to update the source to a
>> newer version.
>>
>> Here are two examples of how this might look. (Note that the
>> free-form text here is just some offhand examples I invented. In real
>> life I assume they would be more detailed.)
>>
>> Example 1: src/java.xml.crypto/share/classes/com/sun/UPSTREAM.md:
>> ===
>> Name: Apache Santuario
>> Homepage: https://santuario.apache.org/
>> License: src/java.xml.crypto/share/legal/santuario.md
>> Version: 2.2.1
>> Upstream-release-URL:
>> https://github.com/apache/santuario-xml-security-java/releases/tag/xmlsec-2.2.1
>>
>> # Upgrade instructions
>>
>> To upgrade the package, copy the source code from
>> `src/main/java/org/apache` in the upstream git repo into
>> `src/java.xml.crypto/share/classes/com/sun/org/apache`. Then update
>> the package name space by running `find
>> src/java.xml.crypto/share/classes/com/sun/org/apache -type f | xargs
>> sed -i -e 's/^package org\.apache/package com.sun.org.apache/'`.
>> ===
>>
>> Example 2: src/java.desktop/share/native/libharfbuzz/UPSTREAM.md:
>> ===
>> Name: Harfbuzz
>> Homepage: https://harfbuzz.github.io/
>> License: src/java.desktop/share/legal/harfbuzz.md
>> Version: 2.8.0
>> Upstream-release-URL:
>> https://github.com/harfbuzz/harfbuzz/releases/tag/2.8.0
>>
>> # How to update
>>
>> To update to a new version of Harfbuzz, copy all `.cc`, `.hh` and
>> `.h` files from `src` into
>> `src/java.desktop/share/native/libharfbuzz`. Check if the build
>> scripts in upstream have changed since the last version, and update
>> our makefiles accordingly.
>> ===
>>
>>
>> These files will serve many purposes:
>>
>> 1) They will be a strong signal to developers coming to an unfamiliar
>> part of the code base that the files here originated upstream.
>>
>> 2) It will be possible for tooling to understand that code in these
>> directories might not live up to normal JDK standards. It would e.g.
>> be possible for the build system to automatically disable
>> warnings-as-errors for such code, or for upcoming tools that support
>> code quality efforts such as blessed modifier order or spell checks
>> to skip those parts of the code.
>>
>> 3) It will be possible to get an at-a-glance overview of what
>> versions of 3rd party code are included in a build of the JDK, for
>> all included projects -- not just as of right now, but at any point
>> in history (since these files get updated when upstream code is
>> updated in the JDK). The build system could, for instance, collect
>> such information and provide it with the built JDK, just as it now
>> collects the licenses from the src/$MODULE/legal directories.
>>
>> 4) The git history for these files will clearly show when the code
>> was last refreshed from upstream, and by whom.
>>
>> 5) And finally, the free-text part gives a well-defined place to
>> store important information about how to upgrade, common mistakes,
>> etc -- knowledge that is sometimes written down in README
>> files, but most often just resides in the head of the developer who
>> last did a refresh.
>>
>> Thoughts?
>>
>> /Magnus
>
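
For illustration only, here is a rough sketch of how tooling might read the
proposed files: formalized headers, a blank line, then free-form markdown.
Nothing here is part of the proposal itself. The function names
`parse_upstream` and `collect_versions` are hypothetical, and treating a
line that follows an empty header (as with Upstream-release-URL in the two
examples) as that header's value is an assumption about the format.

```python
import re
from pathlib import Path

# A header line is "Key: value" or a bare "Key:". A URL such as
# "https://..." does not match, because its colon is not followed by
# whitespace or end of line.
HEADER_RE = re.compile(r"^([A-Za-z][A-Za-z-]*):(?:\s+(.*))?$")


def parse_upstream(text):
    """Split UPSTREAM.md text into (headers dict, free-form body)."""
    # Everything before the first blank line is the header block.
    head, _, body = text.partition("\n\n")
    headers = {}
    last_key = None
    for line in head.splitlines():
        line = line.strip()
        match = HEADER_RE.match(line)
        if match:
            last_key = match.group(1)
            headers[last_key] = (match.group(2) or "").strip()
        elif last_key is not None and line:
            # Assumption: a non-header line continues the previous value.
            headers[last_key] = (headers[last_key] + " " + line).strip()
    return headers, body.strip()


def collect_versions(root):
    """Map component name to version for every UPSTREAM.md under root."""
    versions = {}
    for path in sorted(Path(root).rglob("UPSTREAM.md")):
        headers, _ = parse_upstream(path.read_text(encoding="utf-8"))
        name = headers.get("Name", str(path.parent))
        versions[name] = headers.get("Version", "unknown")
    return versions
```

Running `collect_versions("src")` at the top of a source tree would give the
kind of at-a-glance overview described in point 3, and the mere presence of
an UPSTREAM.md in a directory could serve as the signal for point 2.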