String Functions: Nim vs Python
By Kaushal Modi
While learning the Nim language and trying to correlate that with my Python 3 knowledge, I came across this awesome comparison table of string manipulation functions between the two languages.
My utmost gratitude goes to the developers of Nim, Python, Org,
ob-nim and ob-python, and of course Hugo which allowed me to
publish my notes in this presentable format.
Here are the code samples and their outputs. In each section below, you will find a Python code snippet, followed by its output, and then the same implementation in Nim, followed by the output of that.
Updates #
- Update to Nim 1.5.1
- Update to Nim 1.1.1
- Update to Nim 0.19.0 (just confirmed that all code blocks work as before; no modification was needed).
- Add Better IsLower/IsUpper section; update Python to 3.7.0 and Nim to the latest devel as of today.
- Update the Nim snippets output using the improved echo! The echo output difference is notable in the .split examples. This fixes the issue about confusing echo outputs that I raised in Nim Issue #6225. Big thanks to @bluenote10 from GitHub for Nim PR #6825!
- Update the Understanding the ^N syntax example that gave incorrect output before Nim Issue #6223 got fixed.
- Update the .join example that did not work before Nim Issue #6210 got fixed.
- Use the binary operator ..< instead of the combination of binary operator .. and the deprecated unary < operator.
String slicing #
All characters except last #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[:-1])
a bc def aghij cklm danopqrstuv adefwxyz zy
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# Always add a space around the .. and ..< operators
echo str[0 ..< str.high]
# or
echo str[0 .. ^2]
a bc def aghij cklm danopqrstuv adefwxyz zy
a bc def aghij cklm danopqrstuv adefwxyz zy
Understanding the ^N syntax #
var str = "abc"
# Always add a space around the .. and ..< operators
echo "1st char(0) to last, including \\0(^0) : ", str[0 .. ^0] # Interestingly, this also prints the NULL character in the output.. looks like "abc^@" in Emacs
echo "1st char(0) to last (^1) «3rd» : ", str[0 .. ^1]
echo "1st char(0) to 2nd-to-last(^2) «2nd» : ", str[0 .. ^2]
echo "1st char(0) to 3rd-to-last(^3) «1st» : ", str[0 .. ^3]
echo "1st char(0) to 4th-to-last(^4) «0th» : ", str[0 .. ^4]
# echo "1st char(0) to 5th-to-last(^5) : ", str[0 .. ^5] # Error: unhandled exception: value out of range: -1 [RangeError]
# echo "2nd char(1) to 4th-to-last(^4) «0th» : ", str[1 .. ^4] # Error: unhandled exception: value out of range: -1 [RangeError]
echo "2nd char(1) to 3rd-to-last(^3) «1st» : ", str[1 .. ^3]
echo "2nd char(1) to 2nd-to-last(^2) «2nd» : ", str[1 .. ^2]
echo "2nd char(1) to last, (^1) «3rd» : ", str[1 .. ^1]
echo "Now going a bit crazy .."
echo " 2nd-to-last(^2) «2nd» char to 3rd(2) : ", str[^2 .. 2]
echo " 2nd-to-last(^2) «2nd» char to last(^1) «3rd» : ", str[^2 .. ^1]
echo " 3rd-to-last(^3) «1st» char to 3rd(2) : ", str[^3 .. 2]
1st char(0) to last, including \0(^0) : abc
1st char(0) to last (^1) «3rd» : abc
1st char(0) to 2nd-to-last(^2) «2nd» : ab
1st char(0) to 3rd-to-last(^3) «1st» : a
1st char(0) to 4th-to-last(^4) «0th» :
2nd char(1) to 3rd-to-last(^3) «1st» :
2nd char(1) to 2nd-to-last(^2) «2nd» : b
2nd char(1) to last, (^1) «3rd» : bc
Now going a bit crazy ..
2nd-to-last(^2) «2nd» char to 3rd(2) : bc
2nd-to-last(^2) «2nd» char to last(^1) «3rd» : bc
3rd-to-last(^3) «1st» char to 3rd(2) : abc
- Notes
- It is recommended to always use a space around the .. and ..< binary operators to get consistent results (and no compilation errors!). Examples: [0 ..< str.high], [0 .. str.high], [0 .. ^2], [ .. ^2]. This is based on the tip by @Araq from GitHub (also one of the core devs of Nim). You will find the full discussion around this topic of dots and spaces in Nim Issue #6216.
  "Special ascii chars like %.&$ are collected into a single operator token." – Araq
  To repeat: Always add a space around the .. and ..< operators.
- As of 70ea45cdba, the < unary operator is deprecated! So do [0 ..< str.high] instead of [0 .. <str.high] (see Nim Issue #6788).
- With the example str variable value being "abc", earlier both str[0 .. ^5] and str[1 .. ^4] returned an empty string incorrectly! (see Nim Issue #6223). That now got fixed in b74a5148a9. After the fix, those will cause this error:
  system.nim(3534) []
  system.nim(2819) sysFatal
  Error: unhandled exception: value out of range: -1 [RangeError]
- Also after this fix, str[0 .. ^0] outputs abc^@ (where ^@ is the representation of NULL character).. very cool!
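For readers coming from Python, the ^N backward index can be thought of as len(s) - N, with the .. range inclusive at both ends. The sketch below is only a mental model: nim_slice is a hypothetical helper (it exists in neither language), it covers just the s[first .. ^back] form with back >= 1, and the error it raises is an approximation of the RangeError described above.

```python
def nim_slice(s: str, first: int, back: int) -> str:
    """Model Nim's s[first .. ^back]; hypothetical helper, for illustration only."""
    last = len(s) - back  # ^1 is the last character, ^2 the one before it, ...
    if last < first - 1:
        # Roughly mirrors the "value out of range" RangeError shown in the notes
        raise IndexError(f"value out of range: {last - first + 1}")
    # Nim's .. includes both ends; Python's slice end is exclusive, hence the + 1
    return s[first:last + 1]


print(nim_slice("abc", 0, 1))  # abc
print(nim_slice("abc", 0, 2))  # ab
print(nim_slice("abc", 1, 2))  # b
```

Under this model, str[0 .. ^4] on "abc" is the empty range that ends just before index 0, while str[0 .. ^5] ends two positions before its start and is rejected.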
All characters except first #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[1:])
bc def aghij cklm danopqrstuv adefwxyz zyx
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# echo str[1 .. ] # Does not work.. Error: expression expected, but found ']'
# https://github.com/nim-lang/Nim/issues/6212
# Always add a space around the .. and ..< operators
echo str[1 .. str.high]
# or
echo str[1 .. ^1] # second(1) to last(^1)
bc def aghij cklm danopqrstuv adefwxyz zyx
bc def aghij cklm danopqrstuv adefwxyz zyx
All characters except first and last #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[1:-1])
bc def aghij cklm danopqrstuv adefwxyz zy
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# Always add a space around the .. and ..< operators
echo str[1 ..< str.high]
# or
echo str[1 .. ^2] # second(1) to second-to-last(^2)
bc def aghij cklm danopqrstuv adefwxyz zy
bc def aghij cklm danopqrstuv adefwxyz zy
Count #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.count('a'))
print(str.count('de'))
4
2
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.count('a')
echo str.count("de")
4
2
Starts/ends with #
Starts With #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.startswith('a'))
print(str.startswith('a\t'))
print(str.startswith('z'))
True
True
False
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.startsWith('a') # Recommended Nim style
# or
echo str.startswith('a')
# or
echo str.starts_with('a')
echo str.startsWith("a\t")
echo str.startsWith('z')
true
true
true
true
false
- Notes
- All Nim identifiers are case and underscore insensitive (except for the first character of the identifier), as seen in the above example. So any of startsWith or startswith or starts_with would work the exact same way.
- Though, it has to be noted that using the camelCase variant (startsWith) is preferred in Nim.
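To make the identifier rule concrete, here is a small Python sketch of the comparison as described in the note above. nim_normalize is a hypothetical helper written for this post, not something the Nim compiler exposes, and it ignores any details beyond "first character exact, rest case and underscore insensitive".

```python
def nim_normalize(ident: str) -> str:
    """Approximate how Nim compares identifiers (illustrative assumption):
    keep the first character as-is, then drop underscores and ignore case."""
    return ident[:1] + ident[1:].replace("_", "").lower()


print(nim_normalize("startsWith"))   # startswith
print(nim_normalize("starts_with"))  # startswith
print(nim_normalize("StartsWith"))   # Startswith
```

The last line shows why the first character matters: StartsWith normalizes to a different name than startsWith.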
Ends With #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.endswith('x'))
print(str.endswith('yx'))
print(str.endswith('z'))
True
True
False
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.endsWith('x')
echo str.endsWith("yx")
echo str.endsWith('z')
true
true
false
Expand Tabs #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.expandtabs())
print(str.expandtabs(4))
a bc def aghij cklm danopqrstuv adefwxyz zyx
a bc def aghij cklm danopqrstuv adefwxyz zyx
import strmisc
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.expandTabs()
echo str.expandTabs(4)
a bc def aghij cklm danopqrstuv adefwxyz zyx
a bc def aghij cklm danopqrstuv adefwxyz zyx
Find/Index #
Find (from left) #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.find('a'))
print(str.find('b'))
print(str.find('c'))
print(str.find('zyx'))
print(str.find('aaa'))
0
2
3
41
-1
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.find('a')
echo str.find('b')
echo str.find('c')
echo str.find("zyx")
echo str.find("aaa")
0
2
3
41
-1
Find from right #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.rfind('a'))
print(str.rfind('b'))
print(str.rfind('c'))
print(str.rfind('zyx'))
print(str.rfind('aaa'))
32
2
15
41
-1
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.rfind('a')
echo str.rfind('b')
echo str.rfind('c')
echo str.rfind("zyx")
echo str.rfind("aaa")
32
2
15
41
-1
Index (from left) #
From Python 3 docs,
Like find(), but raise ValueError when the substring is not found.
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.index('a'))
print(str.index('b'))
print(str.index('c'))
print(str.index('zyx'))
# print(str.index('aaa')) # Throws ValueError: substring not found
0
2
3
41
Nim does not have an error raising index function like that
out-of-box, but something like that can be done with:
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# https://nim-lang.org/docs/strutils.html#find,string,string,Natural,Natural
# proc find(s, sub: string; start: Natural = 0; last: Natural = 0): int {..}
proc index(s, sub: auto; start: Natural = 0; last: Natural = 0): int =
  result = s.find(sub, start, last)
  if result < 0:
    raise newException(ValueError, "$1 not found in $2".format(sub, s))
echo str.index('a')
echo str.index('b')
echo str.index('c')
echo str.index("zyx")
# echo str.index("aaa") # Error: unhandled exception: aaa not found in a bc def aghij cklm danopqrstuv adefwxyz zyx [ValueError]
0
2
3
41
- Notes
- No Nim equivalent, but I came up with my own index proc for Nim above.
Index from right #
From Python 3 docs,
Like rfind(), but raise ValueError when the substring is not found.
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.rindex('a'))
print(str.rindex('b'))
print(str.rindex('c'))
print(str.rindex('zyx'))
# print(str.rindex('aaa')) # Throws ValueError: substring not found
32
2
15
41
Nim does not have an error raising rindex function like that
out-of-box, but something like that can be done with:
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# https://nim-lang.github.io/Nim/strutils.html#rfind%2Cstring%2Cstring%2CNatural
# proc rfind(s, sub: string; start: Natural = 0; last = - 1): int {..}
proc rindex(s, sub: auto; start: Natural = 0; last = - 1): int =
  result = s.rfind(sub, start, last)
  if result < 0:
    raise newException(ValueError, "$1 not found in $2".format(sub, s))
echo str.rindex('a')
echo str.rindex('b')
echo str.rindex('c')
echo str.rindex("zyx")
# echo str.rindex("aaa") # Error: unhandled exception: aaa not found in a bc def aghij cklm danopqrstuv adefwxyz zyx [ValueError]
32
2
15
41
- Notes
- No Nim equivalent, but I came up with my own rindex proc for Nim above.
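The Nim index and rindex procs above wrap find and rfind and raise when the result is negative. The same relationship can be written out in Python to show that it reproduces the built-in behavior. The names index_via_find and rindex_via_rfind are made up for this sketch, and the error message format is my own choice.

```python
def index_via_find(s: str, sub: str, start: int = 0) -> int:
    """Like str.index, built on str.find (illustrative wrapper)."""
    pos = s.find(sub, start)
    if pos < 0:
        raise ValueError(f"{sub!r} not found in {s!r}")
    return pos


def rindex_via_rfind(s: str, sub: str, start: int = 0) -> int:
    """Like str.rindex, built on str.rfind (illustrative wrapper)."""
    pos = s.rfind(sub, start)
    if pos < 0:
        raise ValueError(f"{sub!r} not found in {s!r}")
    return pos


sample = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(index_via_find(sample, "c") == sample.index("c"))      # True
print(rindex_via_rfind(sample, "c") == sample.rindex("c"))   # True
```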
String Predicates #
Is Alphanumeric? #
print('abc'.isalnum())
print('012'.isalnum())
print('abc012'.isalnum())
print('abc012_'.isalnum())
print('{}'.isalnum())
print('Unicode:')
print('ábc'.isalnum())
True
True
True
False
False
Unicode:
True
import std/[strutils, sequtils]
echo "abc".allIt(it.isAlphaNumeric())
echo "012".allIt(it.isAlphaNumeric())
echo "abc012".allIt(it.isAlphaNumeric())
echo "abc012_".allIt(it.isAlphaNumeric())
echo "{}".allIt(it.isAlphaNumeric())
echo "[Wrong] ", "ábc".allIt(it.isAlphaNumeric()) # Returns false! isAlphaNumeric works only for ascii.
true
true
true
false
false
[Wrong] false
TODO Figure out how to write unicode-equivalent of isAlphaNumeric #
Is Alpha? #
print('abc'.isalpha())
print('012'.isalpha())
print('abc012'.isalpha())
print('abc012_'.isalpha())
print('{}'.isalpha())
print('Unicode:')
print('ábc'.isalpha())
True
False
False
False
False
Unicode:
True
import strutils except isAlpha
import std/[unicode, sequtils]
echo "abc".allIt(it.isAlphaAscii())
echo "012".allIt(it.isAlphaAscii())
echo "abc012".allIt(it.isAlphaAscii())
echo "abc012_".allIt(it.isAlphaAscii())
echo "{}".allIt(it.isAlphaAscii())
echo "Unicode:"
echo unicode.isAlpha("ábc")
# or
echo isAlpha("ábc") # unicode prefix is not needed
# because of import strutils except isAlpha
# or
echo "ábc".isAlpha() # from unicode
true
false
false
false
false
Unicode:
true
true
true
- Notes
- Thanks to the tip from @dom96 from GitHub on the use of except in import:
  import strutils except isAlpha
  import unicode
  That prevents the ambiguous call error shown below, because we are asking to import everything from strutils except the isAlpha proc. Thus the unicode version of the isAlpha proc is used automatically.
  nim_src_28505flZ.nim(14, 13) Error: ambiguous call; both strutils.isAlpha(s: string)[declared in lib/pure/strutils.nim(289, 5)] and unicode.isAlpha(s: string)[declared in lib/pure/unicode.nim(1416, 5)] match for: (string)
Is Digit? #
print('abc'.isdigit())
print('012'.isdigit())
print('abc012'.isdigit())
print('abc012_'.isdigit())
print('{}'.isdigit())
False
True
False
False
False
import std/[strutils, sequtils]
echo "abc".allIt(it.isDigit())
echo "012".allIt(it.isDigit())
echo "abc012".allIt(it.isDigit())
echo "abc012_".allIt(it.isDigit())
echo "{}".allIt(it.isDigit())
false
true
false
false
false
Better IsLower/IsUpper #
Nim Issue #7963 did not get resolved as I would have liked. This
section has isLowerAsciiPlus, isUpperAsciiPlus, isLowerPlus and
isUpperPlus procs that accept a string input and replace their
non-Plus equivalents from the strutils and unicode modules.
import strutils except isLower, isUpper
import unicode
template isCaseImpl(s, charProc) =
  var hasAtleastOneAlphaChar = false
  if s.len == 0: return false
  for c in s:
    var charIsAlpha = c.isAlphaAscii()
    if not hasAtleastOneAlphaChar:
      hasAtleastOneAlphaChar = charIsAlpha
    if charIsAlpha and (not charProc(c)):
      return false
  return hasAtleastOneAlphaChar

proc isLowerAsciiPlus(s: string): bool =
  ## Checks whether ``s`` is lower case.
  ##
  ## This checks ASCII characters only.
  ##
  ## Returns true if all alphabetical characters in ``s`` are lower
  ## case. Returns false if none of the characters in ``s`` are
  ## alphabetical.
  ##
  ## Returns false if ``s`` is an empty string.
  isCaseImpl(s, isLowerAscii)

proc isUpperAsciiPlus(s: string): bool =
  ## Checks whether ``s`` is upper case.
  ##
  ## This checks ASCII characters only.
  ##
  ## Returns true if all alphabetical characters in ``s`` are upper
  ## case. Returns false if none of the characters in ``s`` are
  ## alphabetical.
  ##
  ## Returns false if ``s`` is an empty string.
  isCaseImpl(s, isUpperAscii)

template runeCaseCheck(s, runeProc) =
  ## Common code for rune.isLower and rune.isUpper.
  if len(s) == 0: return false
  var
    i = 0
    rune: Rune
    hasAtleastOneAlphaRune = false
  while i < len(s):
    fastRuneAt(s, i, rune, doInc=true)
    var runeIsAlpha = isAlpha(rune)
    if not hasAtleastOneAlphaRune:
      hasAtleastOneAlphaRune = runeIsAlpha
    if runeIsAlpha and (not runeProc(rune)):
      return false
  return hasAtleastOneAlphaRune

proc isLowerPlus(s: string): bool =
  ## Checks whether ``s`` is lower case.
  ##
  ## Returns true if all alphabetical runes in ``s`` are lower case.
  ## Returns false if none of the runes in ``s`` are alphabetical.
  ##
  ## Returns false if ``s`` is an empty string.
  runeCaseCheck(s, isLower)

proc isUpperPlus(s: string): bool =
  ## Checks whether ``s`` is upper case.
  ##
  ## Returns true if all alphabetical runes in ``s`` are upper case.
  ## Returns false if none of the runes in ``s`` are alphabetical.
  ##
  ## Returns false if ``s`` is an empty string.
  runeCaseCheck(s, isUpper)
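The rule these Plus procs implement is "ignore non-alphabetic characters, require at least one alphabetic character, and fail on the first alphabetic character of the wrong case". As a rough cross-check, the same rule is sketched in Python below. is_lower_plus is a hypothetical name, and Python's own islower is defined in terms of cased characters rather than alphabetic ones, so the two are only expected to agree on inputs like the ones used in this post.

```python
def is_lower_plus(s: str) -> bool:
    """Python model of the isLowerPlus rule (illustrative sketch)."""
    has_alpha = False
    for ch in s:
        if ch.isalpha():
            has_alpha = True
            if not ch.islower():
                return False  # an alphabetic character that is not lower case
    return has_alpha          # False for empty or all-punctuation strings


for text in ["a b", "ab?!", "1, 2, 3 go!", " ", "Abc", ""]:
    print(repr(text), is_lower_plus(text), text.islower())
```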
Is Lower? #
print('a'.islower())
print('A'.islower())
print('abc'.islower())
print('Abc'.islower())
print('aBc'.islower())
print('012'.islower())
print('{}'.islower())
print('ABC'.islower())
print('À'.islower())
print('à'.islower())
print('a b'.islower()) # Precedent for https://github.com/nim-lang/Nim/issues/7963
print('ab?!'.islower()) # Precedent for https://github.com/nim-lang/Nim/issues/7963
print('1, 2, 3 go!'.islower()) # Precedent for https://github.com/nim-lang/Nim/issues/7963
print(' '.islower()) # checking this proc on a non-alphabet char
print('(*&#@(^#$ '.islower()) # checking this proc on a non-alphabet string
True
False
True
False
False
False
False
False
False
True
True
True
True
False
False
<<islower_isupper_plus>>
echo 'a'.isLowerAscii()
echo 'A'.isLowerAscii()
echo "abc".isLowerAsciiPlus()
echo "Abc".isLowerAsciiPlus()
echo "aBc".isLowerAsciiPlus()
echo "012".isLowerAsciiPlus()
echo "{}".isLowerAsciiPlus()
echo "ABC".isLowerAsciiPlus()
echo "À".isLowerAsciiPlus()
echo "[Wrong] ", "à".isLowerAsciiPlus() # Returns false! As the name suggests, works only for ascii.
echo "À".isLowerPlus()
echo isLowerPlus("à")
echo "à".isLowerPlus()
echo "a b".isLowerAsciiPlus()
echo "a b".isLowerPlus()
echo "ab?!".isLowerPlus()
echo "1, 2, 3 go!".isLowerPlus()
echo ' '.isLowerAscii() # checking this proc on a non-alphabet char
echo ' '.Rune.isLower() # checking this proc on a non-alphabet Rune
echo "(*&#@(^#$ ".isLowerPlus() # checking this proc on a non-alphabet string
true
false
true
false
false
false
false
false
false
[Wrong] false
false
true
true
true
true
true
true
false
false
false
DONE Presence of space and punctuations in string makes isLower return false #
Nim Issue #7963 did not get resolved as I would have liked. So I just rolled my own procs in Better IsLower/IsUpper to fix this issue.
- Notes
- isLower from strutils is deprecated. Use isLowerAscii instead, or isLower from unicode (as done above).
- To check if a non-ascii alphabet is in lower case, use unicode.isLower.
Is Upper? #
print('a'.isupper())
print('A'.isupper())
print('abc'.isupper())
print('Abc'.isupper())
print('aBc'.isupper())
print('012'.isupper())
print('{}'.isupper())
print('ABC'.isupper())
print('À'.isupper())
print('à'.isupper())
print('A B'.isupper()) # Precedent for https://github.com/nim-lang/Nim/issues/7963
print('AB?!'.isupper()) # Precedent for https://github.com/nim-lang/Nim/issues/7963
print('1, 2, 3 GO!'.isupper()) # Precedent for https://github.com/nim-lang/Nim/issues/7963
print(' '.isupper()) # checking this function on a non-alphabet char
print('(*&#@(^#$ '.isupper()) # checking this proc on a non-alphabet string
False
True
False
False
False
False
False
True
True
False
True
True
True
False
False
<<islower_isupper_plus>>
echo 'a'.isUpperAscii()
echo 'A'.isUpperAscii()
echo "abc".isUpperAsciiPlus()
echo "Abc".isUpperAsciiPlus()
echo "aBc".isUpperAsciiPlus()
echo "012".isUpperAsciiPlus()
echo "{}".isUpperAsciiPlus()
echo "ABC".isUpperAsciiPlus()
echo "[Wrong] ", "À".isUpperAsciiPlus() # Returns false! As the name suggests, works only for ascii.
echo "à".isUpperAsciiPlus()
echo "À".isUpperPlus() # from unicode
echo isUpperPlus("À")
echo "à".isUpperPlus() # from unicode
echo "A B".isUpperAsciiPlus() #
echo "A B".isUpperPlus()
echo "AB?!".isUpperPlus()
echo "1, 2, 3 GO!".isUpperPlus()
echo ' '.isUpperAscii() # checking this proc on a non-alphabet char
echo ' '.Rune.isUpper() # checking this proc on a non-alphabet Rune
echo "(*&#@(^#$ ".isUpperPlus() # checking this proc on a non-alphabet string
false
true
false
false
false
false
false
true
[Wrong] false
false
true
true
false
true
true
true
true
false
false
false
DONE Presence of space and punctuations in string makes isUpper return false #
Nim Issue #7963 did not get resolved as I would have liked. So I just rolled my own procs in Better IsLower/IsUpper to fix this issue.
Is Space? #
print(''.isspace())
print(' '.isspace())
print('\t'.isspace())
print('\r'.isspace())
print('\n'.isspace())
print(' \t\n'.isspace())
print('abc'.isspace())
print('Testing with ZERO WIDTH SPACE unicode character below:')
print(''.isspace())
False
True
True
True
True
True
False
Testing with ZERO WIDTH SPACE unicode character below:
False
import strutils except isSpace
import std/[unicode, sequtils]
proc isSpaceAscii(s: string): bool =
  if s == "":
    return false
  s.allIt(it.isSpaceAscii())
echo "".isSpaceAscii() # empty string has to be in double quotes
echo ' '.isSpaceAscii()
echo '\t'.isSpaceAscii()
echo '\r'.isSpaceAscii()
echo "\n".isSpaceAscii() # \n is a string, not a character in Nim
echo " \t\n".isSpaceAscii()
echo "abc".isSpaceAscii()
echo "Testing with ZERO WIDTH SPACE unicode character below:"
echo "[Wrong] ", "".isSpaceAscii() # Returns false! As the name suggests, works only for ascii.
echo "".isSpace() # from unicode
false
true
true
true
true
true
false
Testing with ZERO WIDTH SPACE unicode character below:
[Wrong] false
false
- Notes
- Empty string results in a false result for both Python and Nim variants of isspace.
- \n is a string, not a character in Nim, because based on the OS, \n can comprise one or more characters.
- isSpace from strutils is deprecated. Use isSpaceAscii instead, or isSpace from unicode (as done above).
- To check whether a non-ascii character is whitespace, use unicode.isSpace.
- Interestingly, Nim's isSpace from unicode module returns true for ZERO WIDTH SPACE unicode character (0x200b) as input, but Python's isspace returns false. I believe Python's behavior here is incorrect.
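For context on the ZERO WIDTH SPACE result, the Python documentation defines isspace in terms of the Unicode database: a character counts as whitespace if its general category is Zs or its bidirectional class is WS, B, or S. The snippet below just inspects those properties with the standard unicodedata module; it is offered as an observation about why Python answers the way it does, not as a verdict on which language is right.

```python
import unicodedata

zwsp = "\u200b"  # ZERO WIDTH SPACE
nbsp = "\u00a0"  # NO-BREAK SPACE, for comparison

print(unicodedata.name(zwsp))           # ZERO WIDTH SPACE
print(unicodedata.category(zwsp))       # Cf
print(unicodedata.bidirectional(zwsp))
print(zwsp.isspace())                   # False
print(unicodedata.category(nbsp))       # Zs
print(nbsp.isspace())                   # True
```

Since U+200B is in the Cf (format) category rather than Zs, Python's rule does not treat it as whitespace.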
Is Title? #
print(''.istitle())
print('T'.istitle())
print('Dž'.istitle())
print('The Quick? (“Brown”) Fox Can’t Jump 32.3 Feet, Right?'.istitle()) # Python's output is wrong
print('this is not a title'.istitle())
print('This Is A Title'.istitle())
print('This Is À Title'.istitle())
print('This Is Not a Title'.istitle())
False
True
True
False
False
True
True
False
import std/[unicode, strformat]
# https://github.com/nim-lang/Nim/issues/14348#issuecomment-629414257
proc isTitle(s: string): bool =
  proc isUpperOrTitle(r: Rune): bool = r.isUpper() or r.isTitle()
  var
    alphaSeen = false
  for word in s.split(): # Split s into a sequence of words
    result = true
    var
      upperSeen = false
    let
      runes = word.toRunes()
    for r in runes:
      if not r.isAlpha():
        continue
      alphaSeen = true
      if not upperSeen:
        if r.isUpperOrTitle():
          upperSeen = true
        else:
          return false
      else:
        if r.isUpperOrTitle():
          return false
  if not alphaSeen:
    return false
echo "".isTitle()
echo "T".isTitle()
echo "Dž".isTitle()
echo "The Quick? (“Brown”) Fox Can’t Jump 32.3 Feet, Right?".isTitle()
echo "this is not a title".isTitle()
echo "This Is A Title".isTitle()
echo "This Is À Title".isTitle()
echo "This Is Not a Title".isTitle()
false
true
true
true
false
true
true
false
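The one line where the two outputs differ is the sentence containing Can’t: Python's istitle treats the apostrophe as a word boundary, so the lowercase t that follows it fails the check, whereas the Nim proc above only splits on whitespace. To make that difference visible on the Python side, here is a rough port of the Nim logic. is_title_by_words is a hypothetical name, and the port assumes Python's per-character isalpha, isupper and istitle are close enough to Nim's Rune procs for these inputs.

```python
def is_title_by_words(s: str) -> bool:
    """Port of the Nim isTitle above: words are whitespace-separated,
    non-alphabetic characters are skipped (illustrative sketch)."""
    alpha_seen = False
    for word in s.split():
        upper_seen = False
        for ch in word:
            if not ch.isalpha():
                continue
            alpha_seen = True
            upper_or_title = ch.isupper() or ch.istitle()
            if not upper_seen:
                if not upper_or_title:
                    return False  # first letter of the word is not capitalized
                upper_seen = True
            elif upper_or_title:
                return False      # a later letter of the word is capitalized
    return alpha_seen


sentence = "The Quick? (“Brown”) Fox Can’t Jump 32.3 Feet, Right?"
print(sentence.istitle())           # False
print(is_title_by_words(sentence))  # True
```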
Join #
print(' '.join(['a', 'b', 'c']))
print('xx'.join(['a', 'b', 'c']))
a b c
axxbxxc
import strutils
echo "Sequences:"
# echo @["a", "b", "c"].join(' ') # Error: type mismatch: got (seq[string], char)
echo @["a", "b", "c"].join(" ")
echo join(@["a", "b", "c"], " ")
echo @["a", "b", "c"].join("xx")
echo @['a', 'b', 'c'].join("") # join characters to form strings
echo "Lists:"
echo ["a", "b", "c"].join(" ") # Works after Nim issue # 6210 got fixed.
echo (["a", "b", "c"].join(" ")) # Works!
echo join(["a", "b", "c"], " ") # Works!
var list = ["a", "b", "c"]
echo list.join(" ") # Works too!
Sequences:
a b c
a b c
axxbxxc
abc
Lists:
a b c
a b c
a b c
a b c
- Notes
- The second arg to join, the separator argument, has to be a string; it cannot be a character.
- echo ["a", "b", "c"].join(" ") did not work prior to the fix in ddc131cf07 (see Nim Issue #6210), but now it does!
Justify with filling #
Center Justify with filling #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.center(80))
print(str.center(80, '*'))
a bc def aghij cklm danopqrstuv adefwxyz zyx
******************a bc def aghij cklm danopqrstuv adefwxyz zyx******************
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.center(80)
echo str.center(80, '*')
# or
echo center(str, 80, '*')
a bc def aghij cklm danopqrstuv adefwxyz zyx
******************a bc def aghij cklm danopqrstuv adefwxyz zyx******************
******************a bc def aghij cklm danopqrstuv adefwxyz zyx******************
Left Justify with filling #
print('abc'.ljust(2, '*'))
print('abc'.ljust(10, '*'))
abc
abc*******
import strutils
echo "abc".alignLeft(2, '*')
echo "abc".alignLeft(10, '*')
abc
abc*******
Right Justify with filling #
print('abc'.rjust(10, '*'))
*******abc
import strutils
echo "abc".align(10, '*')
*******abc
Zero Fill #
print('42'.zfill(5))
print('-42'.zfill(5))
print(' -42'.zfill(5))
00042
-0042
0 -42
import strutils
echo "Using align:"
echo "42".align(5, '0')
echo "-42".align(5, '0')
echo "Using zfill:"
proc zfill(s: string; count: Natural): string =
  let strlen = len(s)
  if strlen < count:
    if strlen > 0 and s[0] == '-': # the length check avoids indexing an empty string
      result = "-"
      result.add("0".repeat(count - strlen))
      result.add(s[1 .. s.high])
    else:
      result = "0".repeat(count - strlen)
      result.add(s)
  else:
    result = s
echo "42".zfill(5)
echo "-42".zfill(5)
echo " -42".zfill(5)
Using align:
00042
00-42
Using zfill:
00042
-0042
0 -42
- Notes
- The align in Nim does not do the right thing as the Python zfill does when filling zeroes on the left in strings representing negative numbers.
- No Nim equivalent, but I came up with my own zfill proc for Nim above.
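For reference, the rule that Python's zfill follows can be spelled out in a few lines: pad with zeroes on the left, but keep a leading sign character in front of the padding. The sketch below re-implements it under the hypothetical name zfill_sketch so it can be compared with the built-in. Note that Python treats a leading + the same way as a leading -, which the Nim proc above does not attempt.

```python
def zfill_sketch(s: str, width: int) -> str:
    """Re-implementation of str.zfill's padding rule (illustrative sketch)."""
    pad = width - len(s)
    if pad <= 0:
        return s
    if s[:1] in ("+", "-"):           # s[:1] is "" for an empty string, so this is safe
        return s[0] + "0" * pad + s[1:]
    return "0" * pad + s


for text in ["42", "-42", " -42", "+7", ""]:
    print(repr(zfill_sketch(text, 5)), repr(text.zfill(5)))
```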
Case conversion #
To Lower #
print('a'.lower())
print('A'.lower())
print('abc'.lower())
print('Abc'.lower())
print('aBc'.lower())
print('012'.lower())
print('{}'.lower())
print('ABC'.lower())
print('À'.lower())
print('à'.lower())
a
a
abc
abc
abc
012
{}
abc
à
à
import strutils except toLower
import unicode
echo 'a'.toLowerAscii()
echo 'A'.toLowerAscii()
echo "abc".toLowerAscii()
echo "Abc".toLowerAscii()
echo "aBc".toLowerAscii()
echo "012".toLowerAscii()
echo "{}".toLowerAscii()
echo "ABC".toLowerAscii()
echo "[Wrong] ", "À".toLowerAscii() # Does not work! As the name suggests, works only for ascii.
echo "à".toLowerAscii()
echo "À".toLower() # from unicode
echo "à".toLower() # from unicode
a
a
abc
abc
abc
012
{}
abc
[Wrong] À
à
à
à
- Notes
- toLower from strutils is deprecated. Use toLowerAscii instead, or toLower from unicode (as done above).
- To convert a non-ascii alphabet to lower case, use unicode.toLower.
To Upper #
print('a'.upper())
print('A'.upper())
print('abc'.upper())
print('Abc'.upper())
print('aBc'.upper())
print('012'.upper())
print('{}'.upper())
print('ABC'.upper())
print('À'.upper())
print('à'.upper())
A
A
ABC
ABC
ABC
012
{}
ABC
À
À
import strutils except toUpper
import unicode
echo 'a'.toUpperAscii()
echo 'A'.toUpperAscii()
echo "abc".toUpperAscii()
echo "Abc".toUpperAscii()
echo "aBc".toUpperAscii()
echo "012".toUpperAscii()
echo "{}".toUpperAscii()
echo "ABC".toUpperAscii()
echo "À".toUpperAscii()
echo "[Wrong] ", "à".toUpperAscii() # Does not work! As the name suggests, works only for ascii.
echo "À".toUpper() # from unicode
echo "à".toUpper() # from unicode
A
A
ABC
ABC
ABC
012
{}
ABC
À
[Wrong] à
À
À
- Notes
- toUpper from strutils is deprecated. Use toUpperAscii instead, or toUpper from unicode (as done above).
- To convert a non-ascii alphabet to upper case, use unicode.toUpper.
Capitalize #
str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.capitalize())
A bc def aghij cklm danopqrstuv adefwxyz zyx
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.capitalizeAscii
# or
echo capitalizeAscii(str)
A bc def aghij cklm danopqrstuv adefwxyz zyx
A bc def aghij cklm danopqrstuv adefwxyz zyx
To Title #
print('convert this to title á û'.title())
Convert This To Title Á Û
import unicode
echo "convert this to title á û".title()
Convert This To Title Á Û
Swap Case #
print('Swap CASE example á û Ê'.swapcase())
print('Swap CASE example á û Ê'.swapcase().swapcase())
sWAP case EXAMPLE Á Û ê
Swap CASE example á û Ê
import unicode
echo "Swap CASE example á û Ê".swapcase()
echo "Swap CASE example á û Ê".swapcase().swapcase()
sWAP case EXAMPLE Á Û ê
Swap CASE example á û Ê
- Notes
- See this SO Q/A to read about a few cases where s.swapcase().swapcase()==s is not true (at least for Python).
Strip #
Left/leading and right/trailing Strip #
print('«' + ' spacious '.strip() + '»')
print('«' + '\n string \n \n\n'.strip() + '»')
print('«' + '\n'.strip() + '»')
print('www.example.com'.strip('cmowz.'))
print('mississippi'.strip('mipz'))
«spacious»
«string»
«»
example
ssiss
import strutils
echo "«" & " spacious ".strip() & "»"
echo "«" & "\n string \n \n\n".strip() & "»"
echo "«" & "\n".strip() & "»"
echo "www.example.com".strip(chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(chars={'m', 'i', 'p', 'z'})
«spacious»
«string»
«»
example
ssiss
- Notes
- Python strip takes a string as an argument to specify the letters that need to be stripped off the input string. But Nim strip requires a Set of characters.
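Although Python's strip takes a string, it treats that string as an unordered set of characters, which is the same idea that Nim makes explicit with its set argument. The sketch below models Nim's strip(leading, trailing, chars) in Python to show the correspondence. strip_chars is a hypothetical helper written for this post, not a port of the actual strutils implementation.

```python
def strip_chars(s: str, chars: str, leading: bool = True, trailing: bool = True) -> str:
    """Model of set-based stripping with leading/trailing switches (illustrative)."""
    wanted = set(chars)               # order and repetition in chars do not matter
    start, end = 0, len(s)
    if leading:
        while start < end and s[start] in wanted:
            start += 1
    if trailing:
        while end > start and s[end - 1] in wanted:
            end -= 1
    return s[start:end]


print(strip_chars("mississippi", "mipz"))                  # ssiss
print(strip_chars("mississippi", "mipz", trailing=False))  # ssissippi
print(strip_chars("mississippi", "mipz", leading=False))   # mississ
```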
Left/leading Strip #
print('«' + ' spacious '.lstrip() + '»')
print('www.example.com'.lstrip('cmowz.'))
print('mississippi'.lstrip('mipz'))
«spacious »
example.com
ssissippi
import strutils
echo "«", " spacious ".strip(trailing=false), "»"
echo "www.example.com".strip(trailing=false, chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(trailing=false, chars={'m', 'i', 'p', 'z'})
«spacious »
example.com
ssissippi
Right/trailing Strip #
print('«' + ' spacious '.rstrip() + '»')
print('www.example.com'.rstrip('cmowz.'))
print('mississippi'.rstrip('mipz'))
« spacious»
www.example
mississ
import strutils
echo "«", " spacious ".strip(leading=false), "»"
echo "www.example.com".strip(leading=false, chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(leading=false, chars={'m', 'i', 'p', 'z'})
« spacious»
www.example
mississ
Partition #
First occurrence partition #
print('ab:ce:ef:ce:ab'.partition(':'))
print('ab:ce:ef:ce:ab'.partition('ce'))
('ab', ':', 'ce:ef:ce:ab')
('ab:', 'ce', ':ef:ce:ab')
import strmisc
echo "ab:ce:ef:ce:ab".partition(":") # The argument is a string, not a character
echo "ab:ce:ef:ce:ab".partition("ce")
("ab", ":", "ce:ef:ce:ab")
("ab:", "ce", ":ef:ce:ab")
Right partition or Last occurrence partition #
print('ab:ce:ef:ce:ab'.rpartition(':'))
print('ab:ce:ef:ce:ab'.rpartition('ce'))
('ab:ce:ef:ce', ':', 'ab')
('ab:ce:ef:', 'ce', ':ab')
import strmisc
echo "ab:ce:ef:ce:ab".rpartition(":") # The argument is a string, not a character
# or
echo "ab:ce:ef:ce:ab".partition(":", right=true)
echo "ab:ce:ef:ce:ab".rpartition("ce")
# or
echo "ab:ce:ef:ce:ab".partition("ce", right=true)
("ab:ce:ef:ce", ":", "ab")
("ab:ce:ef:ce", ":", "ab")
("ab:ce:ef:", "ce", ":ab")
("ab:ce:ef:", "ce", ":ab")
Replace #
print('abc abc abc'.replace(' ab', '-xy'))
print('abc abc abc'.replace(' ', '')) # Strip all spaces
print('abc abc abc'.replace(' ab', '-xy', 0))
print('abc abc abc'.replace(' ab', '-xy', 1))
print('abc abc abc'.replace(' ab', '-xy', 2))
abc-xyc-xyc
abcabcabc
abc abc abc
abc-xyc abc
abc-xyc-xyc
import strutils
echo "abc abc abc".replace(" ab", "-xy")
echo "abc abc abc".replace(" ", "") # Strip all spaces
# echo "abc abc abc".replace(" ab", "-xy", 0) # Invalid, does not expect a count:int argument
# echo "abc abc abc".replace(" ab", "-xy", 1) # Invalid, does not expect a count:int argument
# echo "abc abc abc".replace(" ab", "-xy", 2) # Invalid, does not expect a count:int argument
abc-xyc-xyc
abcabcabc
- Notes
- Nim does not allow specifying the number of occurrences to be replaced using a count argument as in the Python version of replace.
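If a count-limited replace is needed, it can be built from repeated find calls, and the same loop should translate to Nim fairly directly. The Python sketch below shows the idea and checks it against the built-in. replace_count is a hypothetical name, and rejecting an empty search string is a simplification of mine (Python's own replace accepts one).

```python
def replace_count(s: str, old: str, new: str, count: int = -1) -> str:
    """Replace at most count occurrences of old; a negative count means all
    (illustrative sketch built on str.find)."""
    if not old:
        raise ValueError("empty search string is not handled in this sketch")
    parts = []
    pos = 0
    while count != 0:
        hit = s.find(old, pos)
        if hit < 0:
            break
        parts.append(s[pos:hit])  # text before the match
        parts.append(new)
        pos = hit + len(old)      # continue searching after the match
        count -= 1
    parts.append(s[pos:])         # remainder after the last replacement
    return "".join(parts)


print(replace_count("abc abc abc", " ab", "-xy", 1))  # abc-xyc abc
print(replace_count("abc abc abc", " ab", "-xy"))     # abc-xyc-xyc
```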
Split #
Split (from left) #
print('1,2,3'.split(','))
print('1,2,3'.split(',', maxsplit=1))
print('1,2,,3,'.split(','))
print('1::2::3'.split('::'))
print('1::2::3'.split('::', maxsplit=1))
print('1::2::::3::'.split('::'))
['1', '2', '3']
['1', '2,3']
['1', '2', '', '3', '']
['1', '2', '3']
['1', '2::3']
['1', '2', '', '3', '']
import strutils
echo "1,2,3".split(',')
echo "1,2,3".split(',', maxsplit=1)
echo "1,2,,3,".split(',')
echo "1::2::3".split("::")
echo "1::2::3".split("::", maxsplit=1)
echo "1::2::::3::".split("::")
@["1", "2", "3"]
@["1", "2,3"]
@["1", "2", "", "3", ""]
@["1", "2", "3"]
@["1", "2::3"]
@["1", "2", "", "3", ""]
Split from right #
rsplit behaves just like split unless the maxsplit argument is given.
print('1,2,3'.rsplit(','))
print('1,2,3'.rsplit(',', maxsplit=1))
print('1,2,,3,'.rsplit(','))
print('1::2::3'.rsplit('::'))
print('1::2::3'.rsplit('::', maxsplit=1))
print('1::2::::3::'.rsplit('::'))
['1', '2', '3']
['1,2', '3']
['1', '2', '', '3', '']
['1', '2', '3']
['1::2', '3']
['1', '2', '', '3', '']
import strutils
echo "1,2,3".rsplit(',')
echo "1,2,3".rsplit(',', maxsplit=1)
echo "1,2,,3,".rsplit(',')
echo "1::2::3".rsplit("::")
echo "1::2::3".rsplit("::", maxsplit=1)
echo "1::2::::3::".rsplit("::")
@["1", "2", "3"]
@["1,2", "3"]
@["1", "2", "", "3", ""]
@["1", "2", "3"]
@["1::2", "3"]
@["1", "2", "", "", "3", ""]
Split Lines #
print('ab c\n\nde fg\rkl\r\n'.splitlines())
print('ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True))
['ab c', '', 'de fg', 'kl']
['ab c\n', '\n', 'de fg\r', 'kl\r\n']
import strutils
echo "ab c\n\nde fg\rkl\r\n".splitLines()
echo "ab c\n\nde fg\rkl\r\n".splitLines(keepEol = true)
@["ab c", "", "de fg", "kl", ""]
@["ab c\n", "\n", "de fg\r", "kl\r\n", ""]
- Notes
- The Nim version creates separate splits for the \r and \n. Note the last "" split created by Nim, but not by Python for the same input string.
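When porting code between the two languages, the trailing empty element is the difference most likely to bite. One way to reproduce the Nim result shown above from Python is to split on the three line endings with a regular expression instead of calling splitlines. This is a sketch that matches the sample output in this post, not a statement about every edge case of Nim's splitLines. split_lines_nim_like is a hypothetical name.

```python
import re


def split_lines_nim_like(s: str) -> list:
    """Split on \\r\\n, \\r or \\n and keep a trailing empty element (illustrative)."""
    return re.split(r"\r\n|\r|\n", s)


text = "ab c\n\nde fg\rkl\r\n"
print(text.splitlines())           # ['ab c', '', 'de fg', 'kl']
print(split_lines_nim_like(text))  # ['ab c', '', 'de fg', 'kl', '']
```

Python's splitlines also recognizes some additional line boundaries (form feed and the Unicode line separator, for example), which this sketch deliberately leaves out.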
Convert #
See the encodings module for equivalents of Python decode and
encode functions.
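As a reminder of what the Python side looks like, encode turns a str into bytes in a given encoding and decode reverses it. The snippet below is only a minimal round trip with two common encodings; it says nothing about the Nim encodings API, which I have not compared in detail here.

```python
text = "à la carte"

utf8_bytes = text.encode("utf-8")      # "à" takes two bytes in UTF-8
latin1_bytes = text.encode("latin-1")  # and one byte in Latin-1

print(len(text), len(utf8_bytes), len(latin1_bytes))  # 10 11 10
print(utf8_bytes.decode("utf-8") == text)             # True
print(latin1_bytes.decode("latin-1") == text)         # True
```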
Others #
There is no equivalent for the Python translate function
in Nim as of writing this.
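For readers who have not used it, this is what translate does on the Python side: it maps or deletes characters one by one according to a table, usually built with str.maketrans. The example below is a minimal illustration of that behavior, which is what a Nim replacement would need to reproduce.

```python
# Map a->x, b->y, c->z and delete every "-" in a single pass
table = str.maketrans("abc", "xyz", "-")

print("a-b-c-d".translate(table))  # xyzd
print("cab".translate(table))      # zxy
```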