Proposal: UPSTREAM.md -- better tracking of upstream code in the JDK

daniel.daugherty at oracle.com daniel.daugherty at oracle.com
Thu Apr 21 19:13:25 UTC 2022


 > Thoughts?

I like this idea. It will also benefit code archaeologists and spelunkers.

Dan


On 4/21/22 2:58 PM, Magnus Ihse Bursie wrote:
> The JDK project depends on many different open source projects. Some 
> of them are linked to as libraries at runtime, but others have their 
> source code directly incorporated into our source tree, known as "3rd 
> party code".
>
> Unfortunately, the haphazard way this code is sprinkled throughout our 
> code base makes it very hard to tell at a glance if some code 
> originated with the JDK project, or is imported from elsewhere 
> ("upstream"). Many times, you need to be well acquainted with these 
> parts of the code to know whether a file is 3rd party code or not. If 
> you do not know, you will need to rely on heuristics such as looking 
> at the path name, checking for unusual copyright headers, or looking 
> at the git history for commits that indicate a refresh from upstream.
>
> I propose we do something about this situation.
>
> My suggestion is that we add a file, UPSTREAM.md, in the top directory 
> of the imported 3rd party code. These files will follow a pattern, 
> with a set of formalized headers on the top, a blank line of 
> separation, and then a free-form markdown text, with e.g. relevant 
> notes about the project, important information about the latest 
> update, or instructions or hints on how to update the source to a 
> newer version.
>
> Here are two examples of how this might look. (Note that the free-form 
> text here is just some offhand examples I invented. In real life I 
> assume they would be more detailed.)
>
> Example 1: src/java.xml.crypto/share/classes/com/sun/UPSTREAM.md:
> ===
> Name: Apache Santuario
> Homepage: https://santuario.apache.org/
> License: src/java.xml.crypto/share/legal/santuario.md
> Version: 2.2.1
> Upstream-release-URL: https://github.com/apache/santuario-xml-security-java/releases/tag/xmlsec-2.2.1
>
> # Upgrade instructions
>
> To upgrade the package, copy the source code from 
> `src/main/java/org/apache` in the upstream git repo into 
> `src/java.xml.crypto/share/classes/com/sun/org/apache`. Then update 
> the package name space by running `find 
> src/java.xml.crypto/share/classes/com/sun/org/apache -type f -name 
> '*.java' | xargs sed -i -e 
> 's/^package org\.apache/package com.sun.org.apache/'`.
> ===
>
> Example 2: src/java.desktop/share/native/libharfbuzz/UPSTREAM.md:
> ===
> Name: Harfbuzz
> Homepage: https://harfbuzz.github.io/
> License: src/java.desktop/share/legal/harfbuzz.md
> Version: 2.8.0
> Upstream-release-URL: https://github.com/harfbuzz/harfbuzz/releases/tag/2.8.0
>
> # How to update
>
> To update to a new version of Harfbuzz, copy all `.cc`, `.hh` and `.h` 
> files from `src` into `src/java.desktop/share/native/libharfbuzz`. 
> Check if the build scripts in upstream have changed since the last 
> version, and update our makefiles accordingly.
> ===
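
The layout described above (formalized headers, a blank line, then free-form markdown) is simple enough to parse mechanically. Below is a minimal sketch in Python, not a proposed implementation. The function name `parse_upstream_md` is hypothetical, and it assumes that each header sits on a single `Key: value` line, with no wrapping.

```python
def parse_upstream_md(text):
    """Split UPSTREAM.md text into (headers, body).

    Assumption: every header is a single 'Key: value' line, and the
    first blank line separates the headers from the free-form markdown.
    """
    headers = {}
    lines = text.splitlines()
    body_start = len(lines)  # no blank line found means there is no body
    for i, line in enumerate(lines):
        if not line.strip():
            body_start = i + 1
            break
        # Split on the first colon only, so URLs in the value stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"line {i + 1} is not a 'Key: value' header: {line!r}")
        headers[key.strip()] = value.strip()
    body = "\n".join(lines[body_start:])
    return headers, body


# Sample input based on Example 2, with a shortened free-form part.
EXAMPLE = """\
Name: Harfbuzz
Homepage: https://harfbuzz.github.io/
License: src/java.desktop/share/legal/harfbuzz.md
Version: 2.8.0
Upstream-release-URL: https://github.com/harfbuzz/harfbuzz/releases/tag/2.8.0

# How to update
"""

headers, body = parse_upstream_md(EXAMPLE)
```

Splitting on the first colon keeps the header syntax close to what mail and HTTP headers use, which most tooling can already handle.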
>
>
> These files will serve many purposes:
>
> 1) They will be a strong signal to developers coming to an unfamiliar 
> part of the code base that the files here originated upstream.
>
> 2) It will be possible for tooling to understand that code in these 
> directories might not live up to normal JDK standards. It would e.g. 
> be possible for the build system to automatically disable 
> warnings-as-errors for such code, or for upcoming tools that support 
> code quality efforts such as blessed modifier order or spell checks to 
> skip those parts of the code.
>
> 3) It will be possible to get an at-a-glance overview of what versions 
> of 3rd party code are included in a build of the JDK, for all included 
> projects -- not just as of right now, but at any point in history 
> (since these files get updated when upstream code is updated in the 
> JDK). The build system could, for instance, collect such information 
> and provide it with the built JDK, just as it now collects the 
> licenses from the src/$MODULE/legal directories.
>
> 4) The git history for these files will clearly show when the code 
> was last refreshed from upstream, and by whom.
>
> 5) And finally, the free-text part gives a well-defined place to store 
> important information about how to upgrade, common mistakes, etc -- 
> knowledge that right now is sometimes written down in README files, but 
> most often just resides in the head of the developer who last did a 
> refresh.
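
Purposes 2) and 3) could be served by very little tooling. The following is a rough sketch in Python, assuming the header format from the examples. The function names `read_headers`, `collect_upstream_versions` and `is_third_party` are hypothetical and not part of any existing JDK build tool.

```python
from pathlib import Path


def read_headers(upstream_file):
    """Read 'Key: value' lines up to the first blank line."""
    headers = {}
    for line in upstream_file.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            break
        key, sep, value = line.partition(":")
        if sep:
            headers[key.strip()] = value.strip()
    return headers


def collect_upstream_versions(root):
    """Map each directory holding an UPSTREAM.md to its (Name, Version)."""
    root = Path(root)
    found = {}
    for upstream_file in sorted(root.rglob("UPSTREAM.md")):
        headers = read_headers(upstream_file)
        rel_dir = upstream_file.parent.relative_to(root).as_posix()
        found[rel_dir] = (headers.get("Name"), headers.get("Version"))
    return found


def is_third_party(path, root):
    """True if path lies in or below a directory containing UPSTREAM.md."""
    root = Path(root).resolve()
    current = Path(path).resolve()
    if not current.is_dir():
        current = current.parent
    while True:
        if (current / "UPSTREAM.md").is_file():
            return True
        # Stop at the source root, or at the filesystem root as a safeguard.
        if current == root or current == current.parent:
            return False
        current = current.parent
```

A build step could call `collect_upstream_versions` on the source root to produce the at-a-glance overview, and a linter could call `is_third_party` to decide whether to skip a file.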
>
> Thoughts?
>
> /Magnus



More information about the jdk-dev mailing list