Efficient Text Searching in Java: Finding the Right String in Any Language
By Laura Werner, 2003-05-25
I hope this article has given you a good idea of how you can use collators to add language-sensitive sorting and searching to your own Java applicatio
The patented technique for applying the Boyer-Moore search algorithm to collation elements was developed by Dr. Mark Davis of IBM. Kathleen Wilson, the manager of the text and international groups at IBM's Center for Java Technology in Silicon Valley, was very indulgent of the time I spent working on this article and the accompanying code. I would also like to thank Mark, Kathleen, Michael Pogue, John Raley, and Helena Shih for reviewing drafts of this article.
This article previously appeared in the February 1999 issue of Java Report magazine and was presented at the 14th International Unicode Conference in March 1999.
First published by IBM DeveloperWorks
