Case insensitive search #132

Open
opened 3 years ago by sloum · 1 comments
sloum commented 3 years ago
Owner

Currently the search function initiated by / is a verbatim match. It has been suggested by user natpen that providing an option for case insensitive search (or just defaulting to it) would be a better way to go.

I am hesitant to add a full regex engine to Bombadillo, but case insensitive search should be very doable.

Currently the search function initiated by `/` is a verbatim match. It has been suggested by user `natpen` that providing an option for case insensitive search (or just defaulting to it) would be a better way to go. I am hesitant to add a full regex engine to Bombadillo, but case insensitive search should be very doable.
sloum added the enhancement non-urgent labels 3 years ago
sloum commented 3 years ago
Poster
Owner

There is a version of this up currently on the branch case-insensitive-search. I do not like that this adds a little bit more weight to a part of the render chain/flow. However, it seems to be pretty negligible overhead (mostly the cost is converting every line in a file to lowercase for the search).

I think there may be a slight problem with this version of case insensitivity: it will find case insensitive results on each line just fine. However, if there are two case versions of a word on a line it will only find the first one. So, for example:

The cat chased the dog.

In that brief bit of text the word the appears twice. Once at the beginning of the sentence and once toward the end. The first occasion has a capital T. When the match highlighting is done only The will be highlighted. Similarly:

[ ... ]
chased the dog. The cat was fine.

In the above example the first instance (the) would be highlighted, but not The.

This has to do with the way search is carried out: Searching is done by finding the index of a word on each line. To add case insensitivity I just converted the search and the line to lower case when finding an index. I then take the index and add the length of the word to get the substring that was found and do a replace on the original line for all isntances of that substring to have the highlighting. As a result, it only highlights the matching word and not both versions even though if they appeared on separate lines they would both be highlighted.

I am, for the moment, ok with this compromise. I still think it is better than adding regex. The search functionality will still get you to any line with a match, leaving the user to visually grep the line as needed.

I'll wait a bit to create a PR on this so that it can sink in a bit and see how folks feel about it over a brief interval.

There is a version of this up currently on the branch `case-insensitive-search`. I do not like that this adds a little bit more weight to a part of the render chain/flow. However, it seems to be pretty negligible overhead (mostly the cost is converting every line in a file to lowercase for the search). I think there may be a slight problem with this version of case insensitivity: it will find case insensitive results on each line just fine. However, if there are two case versions of a word on a line it will only find the first one. So, for example: ``` The cat chased the dog. ``` In that brief bit of text the word `the` appears twice. Once at the beginning of the sentence and once toward the end. The first occasion has a capital `T`. When the match highlighting is done only `The` will be highlighted. Similarly: ``` [ ... ] chased the dog. The cat was fine. ``` In the above example the first instance (`the`) would be highlighted, but not `The`. This has to do with the way search is carried out: Searching is done by finding the index of a word on each line. To add case insensitivity I just converted the search and the line to lower case when finding an index. I then take the index and add the length of the word to get the substring that was found and do a replace on the original line for all isntances of that substring to have the highlighting. As a result, it only highlights the matching word and not both versions even though if they appeared on separate lines they would both be highlighted. I am, for the moment, ok with this compromise. I still think it is better than adding regex. The search functionality will still get you to any line with a match, leaving the user to visually grep the line as needed. I'll wait a bit to create a PR on this so that it can sink in a bit and see how folks feel about it over a brief interval.
sloum self-assigned this 3 years ago
sloum added the in progress label 3 years ago
Sign in to join this conversation.
No Milestone
No Assignees
1 Participants
Notifications
Due Date

No due date set.

Dependencies

No dependencies set.

Reference: sloum/bombadillo#132
Loading…
There is no content yet.