Re: Case insensitive lookfor()
From: Rik
Subject: Re: Case insensitive lookfor()
Date: Fri, 28 Oct 2011 10:33:31 -0700
On 10/27/2011 11:35 PM, Søren Hauberg wrote:
> On Thu, 27 Oct 2011 at 20:07 -0700, Rik wrote:
>> All,
>>
>> I chanced upon an issue with the lookfor () command today. I expected it
>> to find the exact string I was looking for, but instead it uses search_str
>> = lower (input_str). Presumably this new search_str was meant to be
>> compared against an all lower case version of the help string. However,
>> this is not what is in the doc-cache; only the true help text with
>> capitalization is in the doc-cache.
>>
>> Example:
>> get_first_help_sentence ("ls")
>> ans = List directory contents.
>> lookfor ("List directory")
>> <nothing>
>> lookfor ("ist directory")
>> ls List directory contents.
>>
>> The lower() call makes it impossible to find certain keywords, often Matlab
>> ones, which use mixed case such as "UniformOutput". Offhand, I would vote
>> for just removing the lower function call. However, if everyone is really
>> keen to have case insensitive matching then we could expand doc-cache to
>> store a lower-case version of all of the help text. This would double the
>> current doc-cache size from 1.3 MB to 2.6 MB.
> I think we, by default, should be making a case-insensitive search, i.e.
> convert both help text and search term to lower case before comparison.
> We don't have to store the help text in lower case to do this, we can
> just convert it after loading it. It would then be nice to also be able
> to do
>
> lookfor -case-sensitive my_search_term
>
> which would disable any conversion to lower case. I think any
> case-conversion can easily happen at run-time; no need to store anything
> but the original text in the cache.
You're right. I was thinking of making a time/space tradeoff, but it's not
worth it. I just benchmarked two lookfor searches, one that converts the
cache to lower case and one that uses it as is, and the difference is only
10 ms. I can live with that. I'm going to make the change to
case-insensitive searching on the development branch. The -case-sensitive
option can be added later.
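
For reference, here is a rough sketch of the run-time approach. It is not
the actual lookfor code: help_matches is a hypothetical helper, and the
single string below stands in for a doc-cache entry whose real layout is
not shown here. It only illustrates why the current search misses
"List directory" and how lowercasing both sides at run time fixes that.

```octave
1;  # mark this file as a script so the function can be defined inline

# Hypothetical helper (not part of lookfor): true when TERM occurs in
# HELP_TEXT.  Matching is case-insensitive unless CASE_SENSITIVE is true.
function tf = help_matches (help_text, term, case_sensitive)
  if (nargin < 3)
    case_sensitive = false;
  endif
  if (! case_sensitive)
    # Convert both sides at run time; the stored text stays untouched.
    help_text = lower (help_text);
    term = lower (term);
  endif
  tf = ! isempty (strfind (help_text, term));
endfunction

txt = "List directory contents.";   # first help sentence of ls

# Current behavior: only the search term is lowercased, so "List" is missed.
old_hit = ! isempty (strfind (txt, lower ("List directory")));

# Proposed default: case-insensitive comparison.
new_hit = help_matches (txt, "List directory");

# Opt-in exact matching, as the -case-sensitive option would request.
cs_hit = help_matches (txt, "list directory", true);
```

With these definitions old_hit is false, new_hit is true, and cs_hit is
false. A mixed-case keyword such as "UniformOutput" is found either way as
long as it is typed exactly.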
--Rik