RFR: 8026062 : (s) webrev.ksh: fix bug title web scraping, remove teamware, sac, "open bug" and wxfile support
Mike Duigou
mike.duigou at oracle.com
Thu Oct 10 05:17:20 UTC 2013
On Oct 9 2013, at 19:45 , John Rose wrote:
> On Oct 9, 2013, at 3:54 PM, Mike Duigou <mike.duigou at oracle.com> wrote:
>
>> Hello all;
>>
>> This changeset revisits webrev's integration with bugs.openjdk.java.net to correct problems with the scraping of bug titles from jbs html. I also eliminated the now obsolete "-O" option since bugs are now all visible on the preferred bugs.openjdk.java.net system.
>>
>> While I was at it I finally eliminated the obsolete teamware, sac and wxfile support with the hopes that cruft removal may lower the bar to others contributing to webrev (yeah, it's not glamourous but we still need it).
>>
>> http://cr.openjdk.java.net/~mduigou/JDK-8026062/0
>
> Great deletions!
It does feel good!
> Here's my "private reserve" version of webrev's html_quote, some of which you may want to use:
>
> #
> # Make a piece of source code safe for display in an HTML <pre> block.
> +# If the line appears to start with an HTML tag, assume it is markup and don't quote on it.
> #
> html_quote()
> {
> - sed -e "s/&/\&/g" -e "s/</\</g" -e "s/>/\>/g" "$@" | expand
> + sed -e '/^<[a-zA-Z][^<]*[a-zA-Z_0-9\/"'\'']>/{p;d;}' \
> + -e '/^&#*[a-zA-Z_0-9]*;/{p;d;}' \
> + -e "s/&/\&/g" -e "s/</\</g" -e "s/>/\>/g" \
> + "$@" \
> + | expand
> }
You also have one for numeric character entities at the start of a line. I've opted to fix my numeric entities replacer (it was definitely broken, thanks!).
I'm reluctant to add yours since I don't have test data to see if I like the effects. If you can point me to some I'd certainly consider it.
> A few points about it:
>
> If a line appears to already have HTML in it (heuristically!), it goes completely unquoted. (This makes it easy to add markup to "include" files.)
> The sed code for & handles consecutive rewrites, as "&&" or "<&>", which yours does not appear to do.
Yes, it was definitely broken.
> My Mac's version of sed does not appear to rewrite "&" to "&" using your code.
My code was only working accidentally. I had failed to escape the writing of & in the replacement which meant it was writing the matched expression. eek!
> — John
>
> P.S. Like most folks, I have other tweaks in my webrev, but they would be out of scope for this change.
>
> Here is the way I use webrev (in the setting of a workflow script):
> webrev-from-openjdk $force_mercurial -u $WEBREV_USER -o $WEBREV_DIR -N -r -2
>
> The "off by one" revision is very important. It selects the parent of the revision I want to review. I use mq to manage changes all the way through review, so "hg outgoing" is irrelevant to me. Instead, I "hg qgoto" the change I want to review, to make it the tip, and then review against -2. This seems to me far safer and more flexible (and with fewer Merge changesets) than the alternatives.
I use similar. It is bit of black magic to be using the "-2". Now that hg has "secret" changesets (for mq) I'd prefer to be automatically detect whether the changeset being generated is; outgoing (current default), the qapplied, or uncommitted (-N). I would prefer to have it refuse to support any mixed cases. I liked the work done by Jim Gish to produce real changesets (only to lose it with Mercurial 2.1 when MQ patches became secret changesets).
> I also have a push-button workflow script which can generate the next (00, 01, 02, ...) minor webrev version for a given bug and push it to cr.ojn.
I've been meaning to make a script wrapper for the shell commands I use.
ksh ../make/scripts/webrev.ksh -m -u mduigou -c `hg qtop` -r -3 -o ~/`hg qtop`/0
hg log -r -1:qtip --template "{desc}\n" | grep "^[0-9]\{7,7\}: " | tee ~/`hg qtop`/.title
automated directory incrementing would be a nice addition.
> It seems like that should be easy to do with an OpenJDK savvy webrev.
I keep hoping something will come along to replace webrev....
I will update my webrev after some more testing.
Thanks!
Mike
More information about the build-dev
mailing list