>> So you have two strings - one with, say, 3,000 words and another of about 20K with an unknown number of words. If the likelihood of words in the 'search for' string not appearing in the 'search in' string is fairly high, you could use LINQ to obtain a list of words that *did* occur and use that as a basis for further manipulation. Something like:
>> public static List<string> WordsInTarget(string searchFor, string searchIn)
>> {
>>     List<string> s = searchIn.Split().ToList();
>>     return searchFor.Split().Where(x => s.Contains(x)).Distinct().ToList();
>> }
>> No idea how it would compare speed-wise, though. And Split() might need some parameters....
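For instance, the parameterless Split() splits only on whitespace, so punctuation stays glued to words and consecutive separators produce empty entries. A sketch of what those parameters might look like (the delimiter set here is just an assumption - adjust it for your actual data):

```csharp
using System;

class SplitDemo
{
    static void Main()
    {
        // Assumed delimiter set - purely illustrative.
        char[] delimiters = { ' ', '\t', '\r', '\n', ',', '.', ';', ':', '!', '?' };
        string text = "Hello, world. Hello again!";
        // RemoveEmptyEntries avoids zero-length "words" between adjacent delimiters.
        string[] words = text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
        Console.WriteLine(string.Join("|", words));
    }
}
```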
>
>The split is already implemented. The optimization will probably move up one layer when using SubString().
I assume you mean that you are already splitting the 'search for' string? But the benefit above comes from splitting the 'search in' string as well. You end up with a short (well, maybe short) list of words that exist in both. At that point, all other words in the 'search for' string are irrelevant.
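If the lookup itself turns out to be the bottleneck, one possible refinement (just a sketch, not benchmarked against your data) is to load the 'search in' words into a HashSet&lt;string&gt;, so each Contains() check is a constant-time hash lookup instead of a linear scan of a List:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WordSearch
{
    // Sketch: intersect the two word sets. HashSet<string>.Contains is O(1)
    // on average, whereas List<string>.Contains rescans the list every call.
    public static List<string> WordsInTarget(string searchFor, string searchIn)
    {
        var targetWords = new HashSet<string>(searchIn.Split());
        return searchFor.Split()
                        .Where(w => targetWords.Contains(w))
                        .Distinct()
                        .ToList();
    }
}
```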
Oh, and I still don't understand how or where you are using SubString() [or Mid()] in all this?
Maybe if you posted a couple of representative strings it would be easier to guess the best approach...