Skip to content

Added support for fuzzy word search #1037

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

eliasdorneles
Copy link

Implemented "fuzzy" word search as presented in issue #858

@@ -1,5 +1,8 @@
# Chosen

**Note:** This is an experimental fork implementing "fuzzy" word search (as introduced in [this issue](https://github.com/harvesthq/chosen/issues/858)).
For the original Chosen plugin, check out https://github.com/harvesthq/chosen.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shoud be removed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

@stof
Copy link
Contributor

stof commented Feb 14, 2013

the Prototype version should be updated as well. Both implementations should be kept in sync feature-wise

@eliasdorneles
Copy link
Author

@stof hey man, I've just updated the code with your recommendations! :)

@p-j
Copy link

p-j commented Feb 27, 2013

I tried your jquery file and on many case I had <em> parts appearing randomly in the text. especially when I erased part of my search. You may want to strip <em>s at the beginning of the function and add them back when you do the highlighting.

@eliasdorneles
Copy link
Author

@p-j hey, man! I'm not being able to reproduce your problem. Could you provide an example?

@p-j
Copy link

p-j commented Feb 28, 2013

@eliasdorneles, I whas not able to repro with your script only. Sorry for the false alarm.
I'm actually unable to repro in the exact same condition where yesterday I was 100% of the time.. I don't get it...

@eliasdorneles
Copy link
Author

@stof hey, man! Is there something else that is missing for this to go upstream?

@Olical
Copy link

Olical commented Apr 2, 2013

Just thought I should mention that I've been using this branch for a while now and it works really well. I, personally, haven't noticed anything wrong with it. It would be nice to have this in the core repository although I would imagine it would have to be behind a flag for backwards compatibility.

@pfiller pfiller mentioned this pull request May 14, 2013
@pavlovt
Copy link

pavlovt commented Jul 31, 2013

This version is working great!
I think in some cases chosen is not usable without fuzzy search.
Please add it to the master branch!

@tjschuck
Copy link
Member

I can't say when or if this would be merged, but can you please at least a.) rebase against master, and b.) squash this into one (or a few) logical commit(s) to remove a lot of the leftover dev-cycle cruft in the commits? Things like 5e809cb and d806b34 and d6f3a81 should be squashed out of the history.

Directions here if you're unfamiliar.

@eliasdorneles
Copy link
Author

@tjschuck thank you for the directions -- I'm not familiar with rebase yet, I'll give a try.

@eliasdorneles
Copy link
Author

@tjschuck well, I was able to squash it into one commit only -- that should make the revision a bit easier.
I'm not sure if I'm the right person to merge them into the current master, as there has been a lot of new commits in the recent months and I haven't had time to keep up.

@tjschuck
Copy link
Member

tjschuck commented Aug 1, 2013

@eliasdorneles If you rebase against master, you should then be up to date with what's happened there. Looking at https://github.com/eliasdorneles/chosen/commits/master, it appears that you're still quite a bit behind.

@eliasdorneles
Copy link
Author

Well, I meant to say that it was not that simple to rebase because the upstream code had several conflicting changes that were made after my initial pull request, and I had't been able to keep up.
So, today I redid the work against the current code base and cleaned up the code a little bit, I believe now it is easier to review it. :)

@eliasdorneles
Copy link
Author

Can someone please take a second look at this patch?
Or otherwise, le me know if there is a way I could package it so people can use it as an extension of vanilla Chosen?
Thanks, guys!

startpos = text.toLowerCase().indexOf word
while startpos >= 0
text = text.substr(0, startpos) + '<em>' + text.substr(startpos, word.length) + '</em>' + text.substr(startpos + word.length)
startpos = text.toLowerCase().indexOf(word, startpos + highlight_offset)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't you use startpos + highlight_offset + word.length to start matching only at the end of the occurrence we just found ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with that is that it would require the word parts to be in the same order. You see, it supports queries like "state uni" giving as results "United States" and also "Florida State University" -- this is useful when the user is not much familiar with the option names.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this startpos defined on this line is about searching the same word again later in the text, not about searching the next word.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, you're absolutely right -- sorry for my confusion! That makes sense, it should add word.length to get the next match.

@stof
Copy link
Contributor

stof commented Sep 3, 2013

I think the highlighting logic is flawed if you are searching for foo foobar as highlighting foo will find places already highlighted for foobar.
And I think it gets even worse when searching for r</e foobar (and as the search string comes from the user, there is no guarantee about what you get).

regexAnchor = if @search_contains then "" else "^"
regex = new RegExp(regexAnchor + escapedSearchText, 'i')
zregex = new RegExp(escapedSearchText, 'i')
words = searchText.toLowerCase().split(' ')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to filter empty strings out of the list of words (in case of several consecutive spaces) to avoid weird stuff in the highlighting.

@stof stof mentioned this pull request Sep 3, 2013
3 tasks
@eliasdorneles
Copy link
Author

@stof thanks for the review, man -- I'll try to fix these.
I'm not sure about the problem with the search for r</e foobar, because that string only gets replaced if it matches with the option label. Considering this, do you think it is a problem still?

@eliasdorneles
Copy link
Author

@stof hey, man! I've added a new commit fixing the stuff you pointed out.

Also, I've tested the behavior of the highlighting with the search queries you mentioned and they seemed to work fine.
I've added to my local tests an option like this:

<option value="foobar">r&lt;/mark&gt; foo foobar</option>

What I found was:

  1. Searching for </mark> does not causes an issue because get_search_text() returns the search text already escaped. :)
  2. Searching for foobar foo ends up highlighting foo inside foobar indeed, but the nested <mark> tags do not affect the behavior.

If you really think it's worthwhile, I can try coming up with some algorithm to avoid the nested highlighting but as it doesn't affect the behavior, I hesitate to add more code to this with potentially more bugs -- it is simpler as it is.

@koenpunt
Copy link
Contributor

koenpunt commented Sep 4, 2013

I tried implementing fuzzy search using a single regex, which works but then its not possible to highlight the matches.
Anyway, I do think we're going to need unit tests for this functionality. Maybe you can use the tests in my branch as a start: https://github.com/koenpunt/chosen/commits/jasmine-tests

@stof
Copy link
Contributor

stof commented Sep 4, 2013

@koenpunt Before adding specs, it would be easier to merge the PR adding the runner first so that they can be used

@stringfellow
Copy link

bump - what's up with this? Over a year old, lots of people want fuzzy search done right - will it appear in chosen master do you think, folks?

@pfiller
Copy link
Contributor

pfiller commented Sep 19, 2016

I don't yet know if fuzzy filtering will ever get added to Chosen or not, but it seems #2599 is a more up-to-date version and I'm going to close this. Thanks @eliasdorneles and very sorry it took me THREE YEARS to do anything with this.

@pfiller pfiller closed this Sep 19, 2016
@Mikk3lRo Mikk3lRo mentioned this pull request Oct 1, 2017
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants