It's like when you say you want to replace different instances of JavaScript. If you want JavaScript formatted the same way all the time, then you can use this technique to achieve that.
print(kp.replace_keywords(normalize('My name is remy')))
print(kp.replace_keywords(normalize('My name is RÉMY')))
print(kp.replace_keywords(normalize('My name is Rémy')))
output:
my name is Rémy
my name is Rémy
my name is Rémy
Yup but then you're getting my name is Rémy instead of My name is Rémy.
Also, it would allow processing the string without holding several copies of it in memory (and thus possibly working on a stream). If you're dealing with big texts, that might be interesting as well.
I don't have a direct application right now though, but from the things I usually do I'm guessing it would make sense.
@remy : Sorry, I didn't get that completely. Can you please elaborate on the expected output and how the normalize function makes it happen?
Suppose that your input is one of
My name is remy
My name is RÉMY
My name is Rémy
Then your output would be
My name is Rémy
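A minimal sketch of that behaviour using only the standard library. This is not FlashText's implementation, and `normalize_with_offsets` and `replace_normalized` are hypothetical names. The idea is to match against a normalized copy while remembering where each normalized character came from, so everything outside the match keeps its original case:

```python
import re
import unicodedata


def normalize_with_offsets(text):
    """Lowercase and strip accents, tracking each output char's source index."""
    chars, offsets = [], []
    for i, ch in enumerate(text):
        for d in unicodedata.normalize("NFD", ch.lower()):
            if not unicodedata.combining(d):  # drop accent marks
                chars.append(d)
                offsets.append(i)
    return "".join(chars), offsets


def replace_normalized(text, replacements):
    """Replace keywords found in the normalized text; keys must be normalized."""
    norm, offsets = normalize_with_offsets(text)
    keys = sorted(replacements, key=len, reverse=True)  # longest match first
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b")
    out, last = [], 0
    for m in pattern.finditer(norm):
        start = offsets[m.start()]
        end = offsets[m.end() - 1] + 1
        # Swallow accent marks that trail the match in decomposed input.
        while end < len(text) and unicodedata.combining(text[end]):
            end += 1
        out.append(text[last:start])  # untouched original text
        out.append(replacements[m.group(1)])
        last = end
    out.append(text[last:])
    return "".join(out)


for s in ("My name is remy", "My name is RÉMY", "My name is Rémy"):
    print(replace_normalized(s, {"remy": "Rémy"}))
```

All three inputs come out as My name is Rémy, with the capital M preserved.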
It's like when you say you want to replace different instances of JavaScript. If you want JavaScript formatted the same way all the time, then you can use this technique to achieve that.
That can already be done, right?
Yup but then you're getting my name is Rémy instead of My name is Rémy.
Also, it would allow processing the string without holding several copies of it in memory (and thus possibly working on a stream). If you're dealing with big texts, that might be interesting as well.
I don't have a direct application right now though, but from the things I usually do I'm guessing it would make sense.
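To illustrate the streaming point, here is a rough sketch, assuming normalization means lowercasing and dropping accents (the thread doesn't show the actual normalize). `normalize_stream` is a hypothetical helper, not part of FlashText. It works one character at a time, so it never needs the whole text in memory:

```python
import unicodedata


def normalize_stream(chunks):
    """Lazily yield lowercased, accent-free versions of incoming text chunks."""
    for chunk in chunks:
        yield "".join(
            d
            for ch in chunk
            for d in unicodedata.normalize("NFD", ch.lower())
            if not unicodedata.combining(d)  # drop accent marks
        )


# Chunks could come from a file or socket; a list stands in for that here.
print("".join(normalize_stream(["My name is R", "ÉMY"])))
```

Matching keywords that straddle a chunk boundary would still need to be handled on top of this.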
OK Remy. By the way, if we change the normalize method so that it doesn't lowercase the text, your requirement will be met.
Also, whether I call normalize from within FlashText or outside FlashText, it will take the same amount of memory and computation.
Still, I will keep looking for a possible use case for your suggestion. Thanks for bringing it up :) :)
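For reference, an accent-only normalize could look roughly like this. It assumes the current normalize both lowercases and strips accents (its definition isn't shown in this thread), and `strip_accents` is just an illustrative name:

```python
import unicodedata


def strip_accents(text):
    """Remove accent marks but leave the case of every letter untouched."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


print(strip_accents("My name is RÉMY"))  # My name is REMY
```

Since KeywordProcessor is case-insensitive by default, REMY should then still match a keyword added as remy, while the rest of the sentence keeps its original case.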