Proposal: UPSTREAM.md -- better tracking of upstream code in the JDK

Philip Race philip.race at oracle.com
Thu Apr 21 20:19:55 UTC 2022


A "marker" file indicating that something is 3rd party code that may be
updated from time to time seems fine, but upgrading 3rd party libraries
is already a pain, so I'm not sure how prescriptive I'd want to be about
required content beyond the simple basics.

In the client area we've started to add files called UPDATING.txt, where
we put information about the tasks involved in updating. While some
libraries might want to put that in an UPSTREAM.md, I'd want the option
of a single line saying "See UPDATING.txt for ..."

I'm not sure we really need to include the current version in there.
Without it, we could perhaps avoid updating this file on every upgrade.

BTW, the true "upstream location" is more usually a site from which you
download foo-1.2.3.tar.gz, not some repo tag.
We even have some open source 3rd party code for which you won't find a
repo anywhere.

And I don't think it is fair to call the locations of the upstream
libraries "haphazard".
They are in the places they need to be, in many cases partly determined
by the build team, within the necessities of the modular JDK.

I'm curious what
"possible for the build system to automatically disable
warnings-as-errors for such code"
means in practice.
Note that in some cases JDK "glue" code is co-mingled in the same
directory as the 3rd party code, so you'd have to refactor that if this
were applied universally. And perhaps we'd prefer to know about those
warnings rather than just let them re-accumulate.
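
To make the question concrete, here is a minimal sketch (Python, purely
illustrative, not how the JDK build works today) of one way tooling
might decide that a source file is 3rd party: walk up from the file and
look for an UPSTREAM.md marker. The function name `upstream_root` and
the idea of passing in a repo root are my assumptions, not part of the
proposal.

```python
from pathlib import Path
from typing import Optional

MARKER = "UPSTREAM.md"  # marker file name taken from the proposal


def upstream_root(source_file: Path, repo_root: Path) -> Optional[Path]:
    """Return the nearest ancestor directory of source_file that contains
    the marker file, or None if no such directory exists up to repo_root.

    Hypothetical helper: the name and signature are illustrative only.
    """
    repo_root = repo_root.resolve()
    current = source_file.resolve().parent
    while True:
        if (current / MARKER).is_file():
            return current
        # Stop at the repo root, or at the filesystem root if the file
        # turned out to live outside the repo.
        if current == repo_root or current == current.parent:
            return None
        current = current.parent
```

Under a rule like this, any JDK glue code sitting in the same directory
as the imported sources would be classified as 3rd party too, which is
exactly the co-mingling problem mentioned above.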

-phil.

On 4/21/22 11:58 AM, Magnus Ihse Bursie wrote:
> The JDK project depends on many different open source projects. Some 
> of them are linked to as libraries at runtime, but others have their 
> source code directly incorporated into our source tree, known as "3rd 
> party code".
>
> Unfortunately, the haphazard way this code is sprinkled throughout our 
> code base makes it very hard to tell at a glance if some code 
> originated with the JDK project, or is imported from elsewhere 
> ("upstream"). Many times, you need to be well acquainted with these 
> parts of the code to know whether a file is 3rd party code or not. If 
> you do not know, you will need to rely on heuristics such as looking 
> at the path name, checking for unusual copyright headers, or looking 
> at the git history for commits that indicate a refresh from upstream.
>
> I propose we do something about this situation.
>
> My suggestion is that we add a file, UPSTREAM.md, in the top directory 
> of the imported 3rd party code. These files will follow a pattern, 
> with a set of formalized headers at the top, a blank line of 
> separation, and then a free-form markdown text, with e.g. relevant 
> notes about the project, important information about the latest 
> update, or instructions or hints on how to update the source to a 
> newer version.
>
> Here are two examples of how this might look. (Note that the free-form 
> text here is just some offhand examples I invented. In real life I 
> assume they would be more detailed.)
>
> Example 1: src/java.xml.crypto/share/classes/com/sun/UPSTREAM.md:
> ===
> Name: Apache Santuario
> Homepage: https://santuario.apache.org/
> License: src/java.xml.crypto/share/legal/santuario.md
> Version: 2.2.1
> Upstream-release-URL: https://github.com/apache/santuario-xml-security-java/releases/tag/xmlsec-2.2.1
>
> # Upgrade instructions
>
> To upgrade the package, copy the source code from 
> `src/main/java/org/apache` in the upstream git repo into 
> `src/java.xml.crypto/share/classes/com/sun/org/apache`. Then update 
> the package name space by running `find 
> src/java.xml.crypto/share/classes/com/sun/org/apache -type f -name 
> '*.java' | xargs sed -i -e 
> 's/^package org\.apache/package com.sun.org.apache/'`.
> ===
>
> Example 2: src/java.desktop/share/native/libharfbuzz/UPSTREAM.md:
> ===
> Name: Harfbuzz
> Homepage: https://harfbuzz.github.io/
> License: src/java.desktop/share/legal/harfbuzz.md
> Version: 2.8.0
> Upstream-release-URL: https://github.com/harfbuzz/harfbuzz/releases/tag/2.8.0
>
> # How to update
>
> To update to a new version of Harfbuzz, copy all `.cc`, `.hh` and `.h` 
> files from `src` into `src/java.desktop/share/native/libharfbuzz`. 
> Check whether the upstream build scripts have changed since the last 
> version, and update our makefiles accordingly.
> ===
>
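
The layout described above (headers, a blank line, then free-form
markdown) is simple enough that a reader for it could be very small.
The following is only a sketch under my own assumptions: each header
sits on a single `Key: value` line, the first blank line ends the
header block, and the function name `parse_upstream` is hypothetical.

```python
from typing import Dict, Tuple


def parse_upstream(text: str) -> Tuple[Dict[str, str], str]:
    """Split UPSTREAM.md content into (headers, free-form markdown body).

    Assumes one 'Key: value' header per line and that the first blank
    line separates the headers from the body.
    """
    lines = text.splitlines()
    headers: Dict[str, str] = {}
    body_start = len(lines)
    for i, line in enumerate(lines):
        if not line.strip():
            body_start = i + 1  # body begins after the blank separator
            break
        # Split on the first colon only, so URLs in the value survive.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'Key: value' header: {line!r}")
        headers[key.strip()] = value.strip()
    return headers, "\n".join(lines[body_start:])
```

For the Harfbuzz example this would yield a `Version` header of `2.8.0`
and a body starting with the `# How to update` heading.
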
>
> These files will serve many purposes:
>
> 1) They will be a strong signal to developers coming to an unfamiliar 
> part of the code base that the files here originated upstream.
>
> 2) It will be possible for tooling to understand that code in these 
> directories might not live up to normal JDK standards. It would e.g. 
> be possible for the build system to automatically disable 
> warnings-as-errors for such code, or for upcoming tools that support 
> code quality efforts such as blessed modifier order or spell checks to 
> skip those parts of the code.
>
> 3) It will be possible to get an at-a-glance overview of what versions 
> of 3rd party code are included in a build of the JDK, for all included 
> projects -- not just as of right now, but at any point in history 
> (since these files get updated when upstream code is updated in the 
> JDK). The build system could, for instance, collect such information 
> and provide it with the built JDK, just as it now collects the 
> licenses from the src/$MODULE/legal directories.
>
> 4) The git history for these files will clearly show when the code 
> was last refreshed from upstream, and by whom.
>
> 5) And finally, the free-text part gives a well-defined place to store 
> important information about how to upgrade, common mistakes, etc -- 
> knowledge that right now is sometimes written down in README files, but 
> most often just resides in the head of the developer who last did a 
> refresh.
>
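
The at-a-glance overview in point 3 could, for instance, be produced by
scanning the source tree for marker files. This is a rough sketch only:
the function names are made up, headers are assumed to be one
`Key: value` per line, and a missing `Version` header is reported as
"unknown" rather than treated as an error, in case the version ends up
being optional.

```python
from pathlib import Path
from typing import Dict, List


def read_headers(path: Path) -> Dict[str, str]:
    """Read 'Key: value' lines up to the first blank line."""
    headers: Dict[str, str] = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            break  # the blank line ends the header block
        key, sep, value = line.partition(":")
        if sep:
            headers[key.strip()] = value.strip()
    return headers


def third_party_inventory(src_root: Path) -> List[Dict[str, str]]:
    """List every directory under src_root that carries an UPSTREAM.md."""
    inventory = []
    for marker in sorted(src_root.rglob("UPSTREAM.md")):
        headers = read_headers(marker)
        inventory.append({
            "directory": marker.parent.relative_to(src_root).as_posix(),
            "name": headers.get("Name", "unknown"),
            "version": headers.get("Version", "unknown"),
        })
    return inventory
```

Running the same scan on an older checkout would give the picture at
that point in history.
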
> Thoughts?
>
> /Magnus


