Question on String#indexOf(String)

Rémi Forax forax at univ-mlv.fr
Tue Apr 28 12:38:24 UTC 2009


Jim Andreou wrote:
> Answering my own question, probably most (all?) faster algorithms seem 
> to need memory proportional to the size of the alphabet, which is kind 
> of huge for Unicode, so that could be the reason.
No, see:
http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm

And this algorithm is currently implemented by the regex package 
java.util.regex.
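
To illustrate why the skip table need not grow with the alphabet, here is a
minimal sketch of the Horspool simplification of Boyer-Moore (not the full
algorithm, and not the JDK's actual code). The class name HorspoolSketch and
the 128-entry table are assumptions chosen for illustration: characters are
folded into the table with a bit mask, so colliding characters share the
smallest (safest) shift and memory stays constant even for Unicode text.

```java
import java.util.Arrays;

public class HorspoolSketch {
    // Fixed table size (a power of two), independent of the alphabet size.
    private static final int TABLE_SIZE = 128;
    private static final int MASK = TABLE_SIZE - 1;

    /** Returns the index of the first occurrence of pattern in text, or -1. */
    static int indexOf(String text, String pattern) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0) return 0;
        if (m > n) return -1;

        // Default shift: the whole pattern length.
        int[] shift = new int[TABLE_SIZE];
        Arrays.fill(shift, m);
        // For every pattern char except the last, record its distance to the
        // end. Later positions overwrite earlier ones, so each bucket ends up
        // with the smallest shift among the chars that collide in it.
        for (int i = 0; i < m - 1; i++) {
            shift[pattern.charAt(i) & MASK] = m - 1 - i;
        }

        int pos = 0;
        while (pos <= n - m) {
            // Compare right to left.
            int j = m - 1;
            while (j >= 0 && text.charAt(pos + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) return pos;
            // Skip ahead based on the text char aligned with the pattern's end.
            pos += shift[text.charAt(pos + m - 1) & MASK];
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "here is a simple example";
        // Both calls should print the same index.
        System.out.println(indexOf(text, "example"));
        System.out.println(text.indexOf("example"));
    }
}
```

The trade-off is that collisions can only make shifts shorter, never longer,
so the search stays correct and merely loses some speed on text with many
distinct characters.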

Rémi
>
> 2009/4/28 Jim Andreou <jim.andreou at gmail.com>
>
>     Hi,
>
>     I wonder why String#indexOf(String) is implemented as it is.
>     Apparently, when a character mismatch with the searched pattern is
>     found, the pattern is only shifted by one character, but there are
>     faster algorithms, for example
>     see http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/index.html.
>     Was anything smarter tried out but had significant disadvantages
>     for general use? What advantages does the current implementation
>     have? It looks very pessimistic.
>
>     Regards,
>     Dimitris Andreou
>
>



