Here are two strings:
>>> size
'5µm'
>>> another_size
'5μm'
And here's their comparison:
>>> size == another_size
False
Why?
The answer may or may not be obvious depending on the font you're using.
In some fonts the two strings look identical. In others, the two characters differ slightly if you look closely.
The characters used for µ are not the same.
One is the micro sign ("\u00b5" or chr(181)).
The other is the lowercase Greek letter mu ("\u03bc" or chr(956)).
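One way to check which character a string actually contains is to ask Python for each character's code point and Unicode name. This is a minimal sketch using the standard library's `unicodedata` module; the helper name `describe` is made up for illustration:

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

size = "5\u00b5m"          # contains the micro sign
another_size = "5\u03bcm"  # contains the Greek small letter mu

for row in describe(size) + describe(another_size):
    print(row)
```

The middle character of each string shows up with a different code point and a different name, even though the two may render identically.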
If you're comparing texts from different sources which may use either character, a simple comparison may return `False` even when the strings look the same.
Solution: use `casefold()` when comparing strings in this scenario. `casefold()` is a string method similar to `upper()` and `lower()`, but it also takes care of edge cases such as µ.
For these two characters, `casefold()` always returns the lowercase Greek mu, which is the preferred character:
>>> chr(181)
'µ'
>>> chr(956)
'μ'
>>> ord(chr(181).casefold())
956
>>> ord(chr(956).casefold())
956
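Applied to the two strings from the start, the comparison could be wrapped in a small helper. This is only a sketch; `same_text` is a hypothetical name, not a built-in:

```python
def same_text(a, b):
    """Compare two strings after case folding both sides."""
    return a.casefold() == b.casefold()

size = "5\u00b5m"          # micro sign, chr(181)
another_size = "5\u03bcm"  # Greek small letter mu, chr(956)

print(size == another_size)           # False
print(same_text(size, another_size))  # True
```

Because folding also ignores case, `same_text("5µm", "5µM")` returns `True` as well, which may or may not be what you want when the strings are units.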
More generally, `casefold()` is meant for matching strings regardless of case. It's a better choice than converting strings with `upper()` or `lower()` because it handles edge cases like µ or ß.
The German letter ß is another common example where `casefold()` is needed, since ß is equivalent to ss:
>>> "ß".casefold()
'ss'
>>> "groß" == "gross"
False
>>> "groß".casefold() == "gross".casefold()
True
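To see the difference from `lower()` for these two cases, the comparisons can be run side by side. A short sketch; the variable names are only for illustration:

```python
micro, mu = "\u00b5", "\u03bc"

# lower() leaves both characters unchanged, so they still differ
lower_matches = micro.lower() == mu.lower()
# casefold() maps the micro sign to the Greek mu
fold_matches = micro.casefold() == mu.casefold()

# lower() keeps ß as it is, while casefold() expands it to "ss"
gross_lower = "groß".lower() == "gross".lower()
gross_fold = "groß".casefold() == "gross".casefold()

print(lower_matches, fold_matches)  # False True
print(gross_lower, gross_fold)      # False True
```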
Another use case you may come across is text containing ligatures such as 'ﬂ' (the single character "\ufb02") that hasn't been converted to the separate letters f and l.
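The ligature case can be sketched the same way, assuming the text contains the single-character ligature "\ufb02" rather than the two letters f and l. The word "floor" is just an illustrative choice:

```python
ligature = "\ufb02"  # LATIN SMALL LIGATURE FL, one character

word_with_ligature = ligature + "oor"

print(len(word_with_ligature))                              # 4
print(word_with_ligature == "floor")                        # False
print(word_with_ligature.casefold() == "floor".casefold())  # True
```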