Propose to use clang-format to enforce hotspot codestyle

Liu, Xin xxinliu at amazon.com
Thu Mar 12 09:30:08 UTC 2020


Thanks for the feedbacks.

For Andrew’s concerns, I think I got answers.
Sometimes, hand-craft code is better indeed.  Clang-format provides a directive which can temporarily turn it off.
https://releases.llvm.org/6.0.1/tools/clang/docs/ClangFormatStyleOptions.html#disabling-formatting-on-a-piece-of-code
Developers can also define customized formats. clang-format takes precedence the local configure file over parent directory’s.  Assume that GC  guys would like to use an exotic code style, they can define it and place it in their own directory.

I expect to have a huge change, but it’s a one-time thing.  I don’t understand why we have to review it completely.  A formatter should never alter semantics of user’s code.  If hotspot can’t be built or spot any regression, there must be something wrong. We should be glad some bugs are caught by an automatic tool.

Yes, clang-format is a formatter,  but we can turn it into a validator very cheap by git/hg. If we run clang-format and generate any diff, we know the current code violates hotspot’s codestyle. In my proposal, I plan to do that in phase-2, eg. “make check-style”.  Of course, it only works in a clean repo.

How about we just forget about validator. I think we need a .clang-format first. It helps developers to format code automatically. To proof my idea, I made a draft .clang-format and run it against compiler/*.cpp.  I think most of those changes make hotspot more readable.
https://cr.openjdk.java.net/~xliu/8240834/webrev/

I don’t realize that hotspot macros are special. It seems that clang-format has many macro-related styles too.  Let me try them on globals.hpp.

Thanks,
--lx


From: Jesper Wilhelmsson <jesper.wilhelmsson at oracle.com>
Date: Wednesday, March 11, 2020 at 11:18 AM
To: "Liu, Xin" <xxinliu at amazon.com>
Cc: Volker Simonis <volker.simonis at gmail.com>, HotSpot Open Source Developers <hotspot-dev at openjdk.java.net>
Subject: RE: [EXTERNAL]Propose to use clang-format to enforce hotspot codestyle

On 11 Mar 2020, at 13:06, Volker Simonis <volker.simonis at gmail.com<mailto:volker.simonis at gmail.com>> wrote:

Liu, Xin <xxinliu at amazon.com<mailto:xxinliu at amazon.com>> schrieb am Mi., 11. März 2020, 09:50:


Hi,

I just filed a JBS issue JDK-8240834 about hotspot codestyle(
https://wiki.openjdk.java.net/display/HotSpot/StyleGuide). I found many
inconsistent code in hotspot so I propose to use clang-format[1] to enforce
it.

I know you’re all very experienced developers, but sometimes wrong
indentations or tailing whitespaces creep in.  it's hard to catch them in
webrev.  Even though we can, new contributors still need to contact their
sponsors for formatting changes.   I think it's a good idea to have lint
thing done at first place.

Clang is powerful enough to parse modern C/C++ code and it's very handy to
format code. clang-format should be available on all supporting platforms.
It's still okay even if it is absent on some platforms because nightly
build and submit repo will cover unsupported platforms.

I plan to have  a .clang-format and correcting styles in the 1st patch. We
can bring auto clang-format check to hotspot's makefile in the second
patch.  Does it sound like a plan?

Have you already checked how big this first change will be? I'm afraid it
will be huge because we have a lot of manually formatted code which doesn't
confirm to any specific style (e.g. macros). We also have consistent but
different code styles in various parts of HS. So changing all the sources
to conform to a single style will necessarily be quite invasive and I'm not
sure if we want that?


In theory I support using a formatting tool to maintain a single code style throughout the hotspot code, but in practice this is extremely difficult. Volker brings up one point here, the situation today is bad - really bad - from a formatter's perspective. The size of the initial change will be huge. To review that requires people from all affected areas to look in detail on their code to make sure nothing is broken.  Andrew brings up a different point in his mail; parts of the code is manually formatted for a reason. No automatic formatter will be able to make sense of all code if it applies the same rules all over.

I have looked into this problem for quite some time and I believe we have to take a very slow approach to this. Not only to be able to actually review the changes, but also to keep our developers around. First, any changes we make to clean up formatting must be automatically verified from that point forward to stop people from pushing new code that violates the rules we have managed to agree on. This requires a tool that can enable individual checks one by one. Secondly, the tool can't go in and modify people's code, it can only verify that it is correct and give a warning if there is a violation. This means that what we are looking for is a validator, not a formatter. There must also be the option to disable validation for parts of the code.

I have not found any such tools out there, one that can accept the hotspot source code as is without any remarks if all checks are disabled, and then let you enable one check a time as we gradually clean up the code. The best I could find was uncrustify, but it still is a formatter, not a validator, and there are a few style rules that can't be disabled. I asked the developers of uncrustify if there was any chance of having the option for it to be a validator instead but that didn't seem to be in their interest.

As I didn't find one I started to implement my own validator a few years ago. I do have a tool that swallows most of the hotspot code as is, but there are a few issues with macro usages. You see, that is another issue we have to deal with. It's not enough to parse C++. If we want to read and format the code we need to treat our macros as part of the language as well. Hotspot has a few macros that make for real challenges for a parser :-)  I haven't yet included the option to disable verification for a block of code, coming to think of it, that could actually solve the most weird macro usages we have out there. I'll take another look at that at some point.

Until we have a tool of this kind I'm afraid it will be extremely difficult to enforce any style rules in practice. But I do agree that this is a goal worth pursuing.

Cheers,
/Jesper





Reference:
[1] https://clang.llvm.org/docs/ClangFormat.html
Thanks,
--lx




More information about the hotspot-dev mailing list