In lodash there is the words method that can be used to quickly perform lexical analysis (tokenization) of a string. In other words, the lodash words method splits a string into an array of words. In some cases this can be done easily with the native split method, but it is not always so cut and dried. A text sample might contain punctuation or other characters that need to be cut out or kept in the process, for example. So there is a need for some kind of tokenizer method that is better suited to the task of creating an array of words from a text sample.
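To make the trouble with split a little more concrete, here is a minimal sketch in plain JavaScript. The sample strings are made up just for the sake of illustration.

```js
// splitting on a single space leaves punctuation attached to the words
const text = 'Hello, world! How are you?';
const tokens = text.split(' ');
console.log(tokens);
// [ 'Hello,', 'world!', 'How', 'are', 'you?' ]

// repeated spaces also result in empty strings in the array
const spaced = 'one  two'.split(' ');
console.log(spaced);
// [ 'one', '', 'two' ]
```

A more involved regular expression can be passed to split to work around this, but that quickly turns into writing a tokenizer by hand.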
If the full version of lodash is part of the stack of the project that you are working on, the _.words method can be used to quickly get an array of words from a string. The default pattern should work okay in most situations, but if for some reason it does not, it is possible to override it by passing a regular expression as the second argument.
In some situations I might want to do some processing of the text beforehand, or use a custom pattern given as the second argument to lodash words. Say that I have some text with camel case words in it, that is, words that start out lower case but contain one or more upper case letters. The default pattern used by lodash words will break that kind of word into two or more words, which might not be the result that I want.
So then the solution is to make all the text lowercase before I pass it to lodash words, or to use a custom pattern that will not split up words like that.