P.S.: RFR [9] 8133651: automated replacing of old-style tags in docs
Roger Riggs
Roger.Riggs at Oracle.com
Thu Oct 1 13:00:59 UTC 2015
+1 for manageable sized and per repo code-reviews.
(unless someone has a tool that checks the patch to confirm that the
only difference between
the old and the new is the same small number of substitutions; another
quick program perhaps...).
On 10/1/2015 8:40 AM, Lance Andersen wrote:
> Hi Alexander,
>
> Personally, I would find it easier to review if the changes were sent out by category/module vs one large patch as it is easier to miss things.
>
> Also for technologies such as jaxws, these will need to go to the external workspace at the same time for the technology so these should definitely be kept separate.
>
> Best
> Lance
> On Oct 1, 2015, at 7:31 AM, Alexander Stepanov <alexander.v.stepanov at oracle.com> wrote:
>
>>
>> So if you don't have objections, I'll delay for a several days and then publish a final RFR (probably containing changes in some other repos like jaxws, corba or jaxp) which would be more formal (containing bug # and the final specdiff report).
>>
>> Thanks again,
>> Alexander
>>
>>
>> On 10/1/2015 9:54 AM, Martin Buchholz wrote:
>>> Hi s'marks,
>>> You probably don't need to absolutify paths.
>>> And you can easily handle multiple args.
>>>
>>> (just for fun!)
>>> Checks for javadoc comment; handles popular html entities; handles multiple lines; handles both tt and code:
>>>
>>> #!/bin/bash
>>> find "$@" -name '*.java' | \
>>> xargs -r perl -p0777i -e \
>>> 'do {} while s~^ *\*.*\K<(tt|code)>((?:[^<>{}\&\@]|&(?:lt|gt|amp);)*)</\1>~$_=$2; s/</</g; s/>/>/g; s/&/&/g; "{\@code $_}"~mgie'
>>>
>>>
>>> On Wed, Sep 30, 2015 at 6:16 PM, Stuart Marks <stuart.marks at oracle.com <mailto:stuart.marks at oracle.com>> wrote:
>>>
>>> Hi Alexander, Martin,
>>>
>>> The challenge of Perl file slurping and Emacs one-liners was too
>>> much to bear.
>>>
>>> This is Java, so one-liners are hardly possible. Still, there are
>>> a bunch of improvements that can be made to the Java version. (OK,
>>> and I'm showing off a bit.)
>>>
>>> Take a look at this:
>>>
>>> http://cr.openjdk.java.net/~smarks/misc/SimpleTagEditorSmarks1.java <http://cr.openjdk.java.net/%7Esmarks/misc/SimpleTagEditorSmarks1.java>
>>>
>>> I haven't studied the output exhaustively, but it seems to do a
>>> reasonably good job for the common cases. I ran it over java.lang
>>> and I noticed a few cases where there is markup embedded within
>>> <code></code> text, which should be looked at more closely.
>>>
>>> I don't particularly care if you use my version, but there are
>>> some techniques that I'd strongly recommend that you consider
>>> using in any such tool. In particular:
>>>
>>> - Pattern.DOTALL to do multi-line matches
>>> - Pattern.CASE_INSENSITIVE
>>> - try-with-resources to ensure that files are closed properly
>>> - NIO instead of old java.io <http://java.io> APIs, particularly
>>> Files.walk() and streams
>>> - use Scanner to deal with input file buffering
>>> - Scanner's stream support (I recently added this to JDK 9)
>>>
>>> Enjoy,
>>>
>>> s'marks
>>>
>>>
>>>
>>> On 9/29/15 2:23 PM, Martin Buchholz wrote:
>>>
>>> Hi Alexander,
>>>
>>> your change looks good. It's OK to have manual corrections
>>> for automated
>>> mega-changes like this, as long as they all revert changes.
>>>
>>> Random comments:
>>>
>>> Should you publish your specdiff? I guess not - it would be
>>> empty!
>>>
>>> while((s = br.readLine()) != null) {
>>>
>>> by matching only one line at a time, you lose the ability to make
>>> replacements that span lines. Perlers like to "slurp" in the
>>> entire file
>>> as a single string.
>>>
>>> s = s.replace( "<CODE>", tag1);
>>> s = s.replace( "<Code>", tag1);
>>> s = s.replace("</CODE>", tag2);
>>> s = s.replace("</Code>", tag2);
>>>
>>> Why not use case-insensitive regex?
>>>
>>> Here's an emacs-lisp one-liner I've been known to use:
>>>
>>> (defun tt-code ()
>>> (interactive)
>>> (query-replace-regexp
>>> "<\\(tt\\|code\\)>\\([^&<>\\\\]+\\)</\\1>" "{@code
>>> \\2}"))
>>>
>>> With more work, one can automate transformation of embedded
>>> things like <
>>>
>>> But of course, it's not even possible to transform ALL uses of
>>> <code> to
>>> {@code, if there was imaginative use of nested html tags.
>>>
>>>
>>> On Tue, Sep 29, 2015 at 3:21 AM, Alexander Stepanov <
>>> alexander.v.stepanov at oracle.com
>>> <mailto:alexander.v.stepanov at oracle.com>> wrote:
>>>
>>> Updated: a few manual corrections were made (as @linkplain
>>> tags displays
>>> nested {@code } literally):
>>> http://cr.openjdk.java.net/~avstepan/tmp/codeTags/jdk.patch <http://cr.openjdk.java.net/%7Eavstepan/tmp/codeTags/jdk.patch>
>>> -checked with specdiff (which of course does not cover
>>> documentation for
>>> internal packages), no unexpected diffs detected.
>>>
>>> Regards,
>>> Alexander
>>>
>>>
>>> On 9/27/2015 4:52 PM, Alexander Stepanov wrote:
>>>
>>> Hello Martin,
>>>
>>> Here is some simple app. to replace <code></code> tags
>>> with a new-style
>>> {@code } one (which is definitely not so elegant as
>>> the Perl one-liners):
>>> http://cr.openjdk.java.net/~avstepan/tmp/codeTags/SimpleTagEditor.java
>>> <http://cr.openjdk.java.net/%7Eavstepan/tmp/codeTags/SimpleTagEditor.java>
>>>
>>> Corresponding patch for jdk and replacement log (~62k
>>> of the tag changes):
>>> http://cr.openjdk.java.net/~avstepan/tmp/codeTags/jdk.patch
>>> <http://cr.openjdk.java.net/%7Eavstepan/tmp/codeTags/jdk.patch>
>>> http://cr.openjdk.java.net/~avstepan/tmp/codeTags/replace.log
>>> <http://cr.openjdk.java.net/%7Eavstepan/tmp/codeTags/replace.log>
>>> (sorry, I have to check the correctness of the patch
>>> with specdiff yet,
>>> so this is rather demo at the moment).
>>>
>>> Don't know if these changes (cosmetic by nature) are
>>> desired for now or
>>> not. Moreover, probably some part of them should go to
>>> another repos (e.g.,
>>> awt, swing -> "client" instead of "dev").
>>>
>>> Regards,
>>> Alexander
>>>
>>>
>>>
>>> ----- Исходное сообщение -----
>>> От: alexander.v.stepanov at oracle.com
>>> <mailto:alexander.v.stepanov at oracle.com>
>>> Кому: martinrb at google.com <mailto:martinrb at google.com>
>>> Копия: core-libs-dev at openjdk.java.net
>>> <mailto:core-libs-dev at openjdk.java.net>
>>> Отправленные: Четверг, 24 Сентябрь 2015 г 16:06:56 GMT
>>> +03:00 Москва,
>>> Санкт-Петербург, Волгоград
>>> Тема: Re: RFR [9] 8133651: replace some <tt> tags
>>> (obsolete in html5) in
>>> core-libs docs
>>>
>>> Hello Martin,
>>>
>>> Thank you for review and for the notes!
>>>
>>> > I'm biased of course, but I like the approach I
>>> took with
>>> blessed-modifier-order:
>>> > - make the change completely automated
>>> > - leave "human editing" for a separate change
>>> > - publish the code used to make the automated
>>> change (in my case,
>>> typically a perl one-liner)
>>>
>>> Automated replacement has an obvious advantage: it is
>>> fast and massive.
>>> But there are some disadvantages at the same time
>>> (just IMHO).
>>>
>>> Using script it is quite easy to miss some not very
>>> trivial cases, e.g.:
>>> - remove unnecessary linebreaks, like
>>> * <tt>someCode
>>> * </tt>
>>> (which would be better to replace with single-line
>>> {@code someCode};
>>> - joining of successive terms, like "<tt>ONE</tt>,
>>> <tt>TWO</tt>,
>>> <tt>THREE</tt>" -> "{@code ONE, TWO, THREE}";
>>> - errors like extra or missing "<" or ">": *
>>> <tt>Collection
>>> <T></tt>", - there were a lot of them;
>>> - some cases when <tt></tt> should be replaced with
>>> <code></code>, not
>>> {@code } (e.g. because of unicode characters inside of
>>> code etc.);
>>> - extra tags inside of <tt> or <code> which should be
>>> moved outside of
>>> {@code }, like <tt><i>someCode</i></tt> or
>>> <tt><b>someCode</b></tt>;
>>> - simple removing of needless tags, like "<tt>{@link
>>> ...}</tt>" ->
>>> "{@link ...}";
>>> - replace HTML codes with symbols ('<', '>', '@', ...)
>>> - etc.
>>> - plus some other formatting changes and fixes for
>>> misprints which would
>>> be omitted during the automated replacement (and
>>> wouldn't be done in
>>> future manually because there is no motivation for
>>> repeated processing).
>>>
>>> So sometimes it may be difficult to say where is the
>>> border between
>>> "trivial" and "human-editing" cases (and the portion
>>> of "non-trivial
>>> cases" is definitely not minor); moreover, even the
>>> automated
>>> replacement requires the subsequent careful review
>>> before publishing of
>>> webrev (as well as by reviewers who probably wouldn't
>>> be happy to review
>>> hundreds of files at the same time) and iterative
>>> checks/corrections.
>>> specdiff is very useful for this task but also cannot
>>> fully cover the
>>> diffs (as some changes are situated in the internal
>>> com/... sun/...
>>> packages).
>>>
>>> Moreover, I'm sure that some reviewers would be
>>> annoyed with the fact
>>> that some (quite simple) changes were postponed
>>> because they are "not
>>> too trivial to be fixed just now" (because they will
>>> suspect they would
>>> be postponed forever). So the patch creator would
>>> (probably) receive
>>> some advices during the review like "please fix also
>>> fix this and that"
>>> (which is normal, of course).
>>>
>>> So my preference was to make the changes package by
>>> package (in some
>>> reasonable amount of files) not postponing part of the
>>> changes for the
>>> future (sorry for these boring repeating review
>>> requests). Please note
>>> that all the above mentioned is *rather explanation of
>>> my motivation
>>> than objection* :) (and of course I used some text
>>> editor replace
>>> automation which is surely not so advanced as Perl).
>>>
>>> > It's probably correct, but I would have left it
>>> out of this change
>>> Yes, I see. Reverted (please update the web page):
>>> http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/index.html
>>> <http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/index.html>
>>>
>>> Thanks,
>>> Alexander
>>>
>>> P.S. The <tt> replacement job is mostly (I guess,
>>> ~80%) complete. But
>>> probably this approach should be used if some similar
>>> replacement task
>>> for, e.g., <code></code> tags would be planned in
>>> future (there are
>>> thousands of them).
>>>
>>>
>>> On 9/24/2015 6:10 AM, Martin Buchholz wrote:
>>>
>>>
>>> On Sat, Sep 19, 2015 at 6:58 AM, Alexander Stepanov
>>> <alexander.v.stepanov at oracle.com
>>> <mailto:alexander.v.stepanov at oracle.com>
>>> <mailto:alexander.v.stepanov at oracle.com
>>> <mailto:alexander.v.stepanov at oracle.com>>> wrote:
>>>
>>> Hello,
>>>
>>> Could you please review the following fix
>>> http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/
>>> <http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/>
>>> <http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/>
>>> http://cr.openjdk.java.net/~avstepan/8133651/jaxws.00/index.html
>>> <http://cr.openjdk.java.net/%7Eavstepan/8133651/jaxws.00/index.html>
>>> <http://cr.openjdk.java.net/%7Eavstepan/8133651/jaxws.00/index.html
>>>
>>>
>>> for
>>> https://bugs.openjdk.java.net/browse/JDK-8133651
>>>
>>> Just another portion of deprecated <tt> (and
>>> <xmp>) tags replaced
>>> with {@code }. Some misprints were also fixed.
>>>
>>>
>>> I'm biased of course, but I like the approach I
>>> took with
>>> blessed-modifier-order:
>>> - make the change completely automated
>>> - leave "human editing" for a separate change
>>> - publish the code used to make the automated
>>> change (in my case,
>>> typically a perl one-liner)
>>>
>>>
>>> The following (expected) changes were
>>> detected by specdiff:
>>> - removed needless dashes in java.util.Locale,
>>> - removed needless curly brace in
>>> xml.bind.annotation.XmlElementRef
>>>
>>>
>>> I would do a separate automated "removed needless
>>> dashes" changeset.
>>>
>>>
>>> Please let me know if the following changes
>>> are desirable or not:
>>>
>>> http://cr.openjdk.java.net/~avstepan/8133651/jdk.00/src/jdk.jconsole/share/classes/sun/tools/jconsole/Formatter.java.udiff.html
>>> <http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/src/jdk.jconsole/share/classes/sun/tools/jconsole/Formatter.java.udiff.html>
>>> <
>>> http://cr.openjdk.java.net/%7Eavstepan/8133651/jdk.00/src/jdk.jconsole/share/classes/sun/tools/jconsole/Formatter.java.udiff.html
>>>
>>>
>>>
>>>
>>> This is an actual change to the behavior of this
>>> code - the
>>> maintainers of jconsole need to approve it. It's
>>> probably correct,
>>> but I would have left it out of this change. If
>>> you remove it, then I
>>> approve this change.
>>>
>>>
>>>
>>>
>
>
> Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
> Oracle Java Engineering
> 1 Network Drive
> Burlington, MA 01803
> Lance.Andersen at oracle.com
>
>
>
More information about the core-libs-dev
mailing list