You are basically talking about Boyer-Moore and Knuth-Morris-Pratt searching algorithms, of which there is a good discussion in Sedgewick. In practice, when there is a good chance of match failure on the first char. comparison, little is to be gained over the elementary techniques. The other methods are likely to gain on such examples as "duckbill" in the presence of many "ducktails" or "they" amidst many "the..." prefixes.
That's not really true. Boyer-Moore is typically O(M/N) where N is the search string length and M is the text length - that is the longer the search string, the quicker the match. It's also relatively easy to implement, since all you have to do is search checking the last character, and when it fails slide forward a precomputed amount based on the character in the text and the search string (ie, a table of characters is needed). Of coruse, in non-ASCII days, this is more challenging, but for UTF-8 you could probably treate all high bit set characters as equal and still get good results on mostly roman text.
There is some code I use in a shipping product below which you are welcome to adapt. It is quite old code, so the style is not something I'm proud of these days, but that's hardly unuusal. s is a string, already uppercased. It's reading through a file on disk so it is more complex than it needs to be if the content is in memory (and given how simple it is it should give you some idea that Boyer-Moore really is worthwhile for any case where a decent amount of text is expected).
That said, why not just add PCRE to your project. Its BSD licensed so should be useable with any project, and is relatively painless. Then you can do your search with the regexp /b\Qsearchstrin\E\b and it will get all the edge cases right for you, and handle unicode, and all you to offer regex searching too.
Enjoy, Peter.
function BoyerMooreFind: OSErr; { Boyer-Moore basic search } { From Algorithms by Robert Sedgewick, Second Edition, pg 287-288 } var skip: array[char] of integer; i, j: integer; pos, count: longInt; oe: OSErr; ch: char; begin for i := 0 to 255 do skip[chr(i)] := len; for i := 1 to len do skip[s[i]] := len - i; oe := GetFPos(ob.refnum, pos); for i := 1 to len do ob.buffer[i] := chr(0); i := len + 1; j := len; while (oe = noErr) and (j >= 1) and (pos < ob.filelen) do begin BlockMove(@ob.buffer[i - len], @ob.buffer, len - 1); i := len; count := SizeOf(OBBufferType) - len; if count + pos > ob.filelen then count := ob.filelen - pos; oe := FSRead(ob.refnum, count, @ob.buffer[len]); if (oe = noErr) & (count = 0) then oe := errGeneric; pos := pos + count; if oe = noErr then while (i <= count) and (j >= 1) do begin ch := UpperCaseChar(ob.buffer[i]); if ch = s[j] then begin i := i - 1; j := j - 1; end else begin if len - j + 1 > skip[ch] then begin i := i + len - j + 1; end else begin i := i + skip[ch]; end; j := len; end; end; end; if j >= 1 then begin if oe = noErr then oe := errGeneric; end else begin oe := SetFPos(ob.refnum, fsFromStart, pos - count + i); end; BoyerMooreFind := oe; end;