
String Functions: Nim vs Python

While learning the Nim language and trying to correlate that with my Python 3 knowledge, I came across this awesome comparison table of string manipulation functions between the two languages.

My utmost gratitude goes to the developers of Nim, Python, Org, ob-nim and ob-python, and of course Hugo which allowed me to publish my notes in this presentable format.

Here are the code samples and their outputs. In each mini-section below, you will find a Python code snippet, followed by its output, and then the same implementation in Nim, followed by the output of that.

The tool versions used are:

  • Python 3.6.2
  • Nim 0.17.3 (2017-10-04) [Linux: amd64] git hash: 0b7b116 (devel branch)

String slicing

All characters except last

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[:-1])
a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# Always add a space around the .. operator
echo str[0 .. <str.high]
# or
echo str[0 .. ^2]
# or
echo str[ .. ^2]
a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy

Understanding the ^N syntax

var str = "abc"
# Always add a space around the .. operator
echo "1st char(0) to last, including \\0(^0) : ", str[0 .. ^0]
echo "1st char(0) to last       (^1) «3rd»  : ", str[0 .. ^1]
echo "1st char(0) to 2nd-to-last(^2) «2nd»  : ", str[0 .. ^2]
echo "1st char(0) to 3rd-to-last(^3) «1st»  : ", str[0 .. ^3]
echo "1st char(0) to 4th-to-last(^4) «0th»  : ", str[0 .. ^4]
echo "2nd char(1) to 4th-to-last(^4) «0th»  : ", str[1 .. ^4]
echo "2nd char(1) to 3rd-to-last(^3) «1st»  : ", str[1 .. ^3]
echo "2nd char(1) to 2nd-to-last(^2) «2nd»  : ", str[1 .. ^2]
echo "2nd char(1) to last,      (^1) «3rd»  : ", str[1 .. ^1]
echo "Now going a bit crazy .."
echo " 2nd-to-last(^2) «2nd» char to 3rd(2)         : ", str[^2 .. 2]
echo " 2nd-to-last(^2) «2nd» char to last(^1) «3rd» : ", str[^2 .. ^1]
echo " 3rd-to-last(^3) «1st» char to 3rd(2)         : ", str[^3 .. 2]
1st char(0) to last, including \0(^0) : abc
1st char(0) to last       (^1) «3rd»  : abc
1st char(0) to 2nd-to-last(^2) «2nd»  : ab
1st char(0) to 3rd-to-last(^3) «1st»  : a
1st char(0) to 4th-to-last(^4) «0th»  :
2nd char(1) to 4th-to-last(^4) «0th»  :
2nd char(1) to 3rd-to-last(^3) «1st»  :
2nd char(1) to 2nd-to-last(^2) «2nd»  : b
2nd char(1) to last,      (^1) «3rd»  : bc
Now going a bit crazy ..
 2nd-to-last(^2) «2nd» char to 3rd(2)         : bc
 2nd-to-last(^2) «2nd» char to last(^1) «3rd» : bc
 3rd-to-last(^3) «1st» char to 3rd(2)         : abc

Notes

  • Always put a space on each side of the .. operator to get consistent results (and no compilation errors!). Examples: [0 .. <str.high], [0 .. str.high], [0 .. ^2], [ .. ^2]. This is based on a tip by GitHub user Araq (one of the core devs of Nim). You can find the full discussion of dots and spaces in Nim Issue #6216.

    Special ascii chars like % . & $ are collected into a single operator token. – Araq

  • To repeat: Always add a space around the .. operator.
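
For readers coming from Python, the inclusive ^N bound can be mapped onto Python's exclusive slice end. The sketch below uses a hypothetical helper, nim_slice (it is not part of Nim or Python), and assumes N is at least 1. It reproduces the "abc" results listed above.

```python
def nim_slice(s, first, back):
    """Emulate Nim's s[first .. ^back] (inclusive upper bound) with a Python slice.

    `nim_slice` is a hypothetical helper for illustration; `back` is the N in ^N.
    """
    # ^back refers to index len(s) - back, and Nim includes that index,
    # so the exclusive Python end is one past it.
    end = len(s) - back + 1
    # Clamp at 0 so a negative end is not read as a Python negative index.
    return s[first:max(end, 0)]


s = "abc"
print(nim_slice(s, 0, 1))        # abc
print(nim_slice(s, 0, 2))        # ab
print(nim_slice(s, 0, 3))        # a
print(repr(nim_slice(s, 0, 4)))  # ''
print(nim_slice(s, 1, 1))        # bc
```

In other words, Nim's ^1 plays the role of Python's "up to the end", and ^2 corresponds to Python's -1 as a slice end.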

All characters except first

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[1:])
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# echo str[1 .. ] # Does not work.. Error: expression expected, but found ']'
# https://github.com/nim-lang/Nim/issues/6212
# Always add a space around the .. operator
echo str[1 .. str.high]
# or
echo str[1 .. ^1] # second(1) to last(^1)
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx

All characters except first and last

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str[1:-1])
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# Always add a space around the .. operator
echo str[1 .. <str.high]
# or
echo str[1 .. ^2] # second(1) to second-to-last(^2)
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy
	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zy

Count

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.count('a'))
print(str.count('de'))
4
2
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.count('a')
echo str.count("de")
4
2

Starts/ends with

Starts With

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.startswith('a'))
print(str.startswith('a\t'))
print(str.startswith('z'))
True
True
False
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.startsWith('a') # Recommended Nim style
# or
echo str.startswith('a')
# or
echo str.starts_with('a')
echo str.startsWith("a\t")
echo str.startsWith('z')
true
true
true
true
false

Notes

  • All Nim identifiers are case and underscore insensitive (except for the first character of the identifier), as seen in the above example. So any of startsWith or startswith or starts_with would work the exact same way.
  • Though, it has to be noted that using the camelCase variant (startsWith) is preferred in Nim.
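
The rule in the notes above can be sketched in Python as a normalization step. The helper name nim_ident_key is hypothetical, and this only models the rule as stated here (first character exact, the rest ignoring case and underscores); it is not how the Nim compiler is actually implemented.

```python
def nim_ident_key(name):
    """Normalize an identifier per the rule described above (illustrative only)."""
    # The first character is compared exactly; the rest ignores case and underscores.
    return name[0] + name[1:].replace("_", "").lower()


for n in ["startsWith", "startswith", "starts_with", "StartsWith"]:
    print(n, "->", nim_ident_key(n))
```

The first three names map to the same key, while StartsWith differs because its first character is upper case.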

Ends With

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.endswith('x'))
print(str.endswith('yx'))
print(str.endswith('z'))
True
True
False
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.endsWith('x')
echo str.endsWith("yx")
echo str.endsWith('z')
true
true
false

Expand Tabs

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.expandtabs())
print(str.expandtabs(4))
a       bc      def     aghij   cklm    danopqrstuv     adefwxyz        zyx
a   bc  def aghij   cklm    danopqrstuv adefwxyz    zyx
import strmisc
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.expandTabs()
echo str.expandTabs(4)
a       bc      def     aghij   cklm    danopqrstuv     adefwxyz        zyx
a   bc  def aghij   cklm    danopqrstuv adefwxyz    zyx

Find/Index

Find (from left)

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.find('a'))
print(str.find('b'))
print(str.find('c'))
print(str.find('zyx'))
print(str.find('aaa'))
0
2
3
41
-1
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.find('a')
echo str.find('b')
echo str.find('c')
echo str.find("zyx")
echo str.find("aaa")
0
2
3
41
-1

Find from right

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.rfind('a'))
print(str.rfind('b'))
print(str.rfind('c'))
print(str.rfind('zyx'))
print(str.rfind('aaa'))
32
2
15
41
-1
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.rfind('a')
echo str.rfind('b')
echo str.rfind('c')
echo str.rfind("zyx")
echo str.rfind("aaa")
32
2
15
41
-1

Index (from left)

From Python 3 docs,

Like find(), but raise ValueError when the substring is not found.

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.index('a'))
print(str.index('b'))
print(str.index('c'))
print(str.index('zyx'))
# print(str.index('aaa')) # Throws ValueError: substring not found
0
2
3
41

Nim does not ship an error-raising index function out of the box, but a similar proc can be written like this:

import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# https://nim-lang.org/docs/strutils.html#find,string,string,Natural,Natural
# proc find(s, sub: string; start: Natural = 0; last: Natural = 0): int {..}
proc index(s, sub: auto; start: Natural = 0; last: Natural = 0): int =
  result = s.find(sub, start, last)
  if result<0:
    raise newException(ValueError, "$1 not found in $2".format(sub, s))

echo str.index('a')
echo str.index('b')
echo str.index('c')
echo str.index("zyx")
# echo str.index("aaa") # Error: unhandled exception: aaa not found in a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx [ValueError]
0
2
3
41

Notes

  • No Nim equivalent, but I came up with my own index proc for Nim above.

Index from right

From Python 3 docs,

Like rfind(), but raise ValueError when the substring is not found.

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.rindex('a'))
print(str.rindex('b'))
print(str.rindex('c'))
print(str.rindex('zyx'))
# print(str.rindex('aaa')) # Throws ValueError: substring not found
32
2
15
41

Nim does not ship an error-raising rindex function out of the box, but a similar proc can be written like this:

import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
# https://nim-lang.org/docs/strutils.html#rfind,string,string,int
# proc rfind(s, sub: string; start: int = - 1): int {..}
proc rindex(s, sub: auto; start: int = -1): int =
  result = s.rfind(sub, start)
  if result<0:
    raise newException(ValueError, "$1 not found in $2".format(sub, s))

echo str.rindex('a')
echo str.rindex('b')
echo str.rindex('c')
echo str.rindex("zyx")
# echo str.rindex("aaa") # Error: unhandled exception: aaa not found in a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx [ValueError]
32
2
15
41

Notes

  • No Nim equivalent, but I came up with my own rindex proc for Nim above.

Format

print('{} {}'.format(1, 2))
print('{} {}'.format('a', 'b'))
1 2
a b
import strutils
# echo "$1 $2" % [1, 2] # This gives error. % cannot have int list as arg, has to be string list
# echo "$1 $2" % ['a', 'b'] # This gives error. % cannot have char list as arg, has to be string list
echo "$1 $2" % ["a", "b"]
# or
echo "$1 $2".format(["a", "b"])
# or
echo "$1 $2".format("a", "b")
# or
echo "$1 $2".format('a', 'b') # format, unlike % does auto-stringification of the input
echo "$1 $2".format(1, 2)
a b
a b
a b
a b
1 2

Using strfmt module

import strfmt
echo "{} {}".fmt(1, 0)
echo "{} {}".fmt('a', 'b')
echo "{} {}".fmt("abc", "def")
echo "{0} {1} {0}".fmt(1, 0)
echo "{0.x} {0.y}".fmt((x: 1, y:"foo"))
1 0
a b
abc def
1 0 1
1 foo

Notes

  • Thanks to a tip by GitHub user TiberniumN, I can use the strfmt module to get a Python .format()-like formatting function.
  • You will need to install it first by doing nimble install strfmt.
  • See this for full documentation on strfmt. The awesome thing is that it allows using a format syntax that’s similar to Python’s Format Specification Mini-Language 🙌
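
As a reminder of what the Python side of that mini-language looks like, here are a few format specs. This is plain Python; whether strfmt accepts each of these exact specs is not verified here, so check its documentation before porting them.

```python
right = '{:>8.2f}'.format(3.14159)         # right-align in 8 columns, 2 decimals
centered = '{:*^9}'.format('abc')          # center in 9 columns, fill with '*'
zero_padded = '{:05d}'.format(-42)         # sign-aware zero padding
bases = '{:x} {:o} {:b}'.format(255, 8, 5) # hex, octal, binary
reused = '{0} {1} {0}'.format(1, 0)        # positional arguments can repeat

for value in (right, centered, zero_padded, bases, reused):
    print(repr(value))
```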

String Predicates

Is Alphanumeric?

print('abc'.isalnum())
print('012'.isalnum())
print('abc012'.isalnum())
print('abc012_'.isalnum())
print('{}'.isalnum())
print('Unicode:')
print('ábc'.isalnum())
True
True
True
False
False
Unicode:
True
import strutils
echo "abc".isAlphaNumeric()
echo "012".isAlphaNumeric()
echo "abc012".isAlphaNumeric()
echo "abc012_".isAlphaNumeric()
echo "{}".isAlphaNumeric()
echo "[Wrong] ", "ábc".isAlphaNumeric() # Returns false! isAlphaNumeric works only for ascii.
true
true
true
false
false
[Wrong] false

TODO Figure out how to write unicode-equivalent of isAlphaNumeric
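
One possible direction for that TODO, sketched in Python rather than Nim: treat a string as alphanumeric when every character falls in a Unicode letter (L*) or number (N*) general category. The helper name is_alnum_by_category is made up for this sketch, and the category test is only an approximation of what Python's isalnum does.

```python
import unicodedata


def is_alnum_by_category(s):
    """Approximate a Unicode-aware alphanumeric check via general categories."""
    # Letters are categories L*, numbers are N*; an empty string is not alphanumeric.
    return bool(s) and all(unicodedata.category(ch)[0] in ("L", "N") for ch in s)


for sample in ["abc", "012", "abc012", "abc012_", "{}", "ábc"]:
    print(sample, is_alnum_by_category(sample))
```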

Is Alpha?

print('abc'.isalpha())
print('012'.isalpha())
print('abc012'.isalpha())
print('abc012_'.isalpha())
print('{}'.isalpha())
print('Unicode:')
print('ábc'.isalpha())
True
False
False
False
False
Unicode:
True
import strutils except isAlpha
import unicode
echo "abc".isAlphaAscii()
echo "012".isAlphaAscii()
echo "abc012".isAlphaAscii()
echo "abc012_".isAlphaAscii()
echo "{}".isAlphaAscii()
echo "Unicode:"
echo unicode.isAlpha("ábc")
# or
echo isAlpha("ábc") # unicode prefix is not needed
                    # because of import strutils except isAlpha
# or
echo "ábc".isAlpha() # from unicode
true
false
false
false
false
Unicode:
true
true
true

Notes

  • Thanks to the tip from GitHub user dom96 on the use of except in import:

    import strutils except isAlpha
    import unicode

    That prevents the ambiguous call error shown below, because it imports everything from strutils except the isAlpha proc. The unicode version of isAlpha is then used automatically.

    nim_src_28505flZ.nim(14, 13) Error: ambiguous call; both strutils.isAlpha(s: string)[declared in lib/pure/strutils.nim(289, 5)] and unicode.isAlpha(s: string)[declared in lib/pure/unicode.nim(1416, 5)] match for: (string)

Is Digit?

print('abc'.isdigit())
print('012'.isdigit())
print('abc012'.isdigit())
print('abc012_'.isdigit())
print('{}'.isdigit())
False
True
False
False
False
import strutils
echo "abc".isDigit()
echo "012".isDigit()
echo "abc012".isDigit()
echo "abc012_".isDigit()
echo "{}".isDigit()
false
true
false
false
false

Is Lower?

print('a'.islower())
print('A'.islower())
print('abc'.islower())
print('Abc'.islower())
print('aBc'.islower())
print('012'.islower())
print('{}'.islower())
print('ABC'.islower())
print('À'.islower())
print('à'.islower())
True
False
True
False
False
False
False
False
False
True
import strutils except isLower
import unicode
echo 'a'.isLowerAscii()
echo 'A'.isLowerAscii()
echo "abc".isLowerAscii()
echo "Abc".isLowerAscii()
echo "aBc".isLowerAscii()
echo "012".isLowerAscii()
echo "{}".isLowerAscii()
echo "ABC".isLowerAscii()
echo "À".isLowerAscii()
echo "[Wrong] ", "à".isLowerAscii() # Returns false! As the name suggests, works only for ascii.
echo "À".isLower() # from unicode
echo "à".isLower() # from unicode
# echo "à".unicode.isLower() # Does not work, Error: undeclared field: 'unicode'
true
false
true
false
false
false
false
false
false
[Wrong] false
false
true

Notes

  1. isLower from strutils is deprecated. Use isLowerAscii instead, or isLower from unicode (as done above).
  2. To check whether a non-ASCII letter is in lower case, use unicode.isLower.

Is Upper?

print('a'.isupper())
print('A'.isupper())
print('abc'.isupper())
print('Abc'.isupper())
print('aBc'.isupper())
print('012'.isupper())
print('{}'.isupper())
print('ABC'.isupper())
print('À'.isupper())
print('à'.isupper())
False
True
False
False
False
False
False
True
True
False
import strutils except isUpper
import unicode
echo 'a'.isUpperAscii()
echo 'A'.isUpperAscii()
echo "abc".isUpperAscii()
echo "Abc".isUpperAscii()
echo "aBc".isUpperAscii()
echo "012".isUpperAscii()
echo "{}".isUpperAscii()
echo "ABC".isUpperAscii()
echo "[Wrong] ", "À".isUpperAscii() # Returns false! As the name suggests, works only for ascii.
echo "à".isUpperAscii()
echo "À".isUpper() # from unicode
echo "à".isUpper() # from unicode
# echo "À".unicode.isUpper() # Does not work, Error: undeclared field: 'unicode'
false
true
false
false
false
false
false
true
[Wrong] false
false
true
false

Notes

  1. isUpper from strutils is deprecated. Use isUpperAscii instead, or isUpper from unicode (as done above).
  2. To check whether a non-ASCII letter is in upper case, use unicode.isUpper.

Is Space?

print(''.isspace())
print(' '.isspace())
print('\t'.isspace())
print('\r'.isspace())
print('\n'.isspace())
print(' \t\n'.isspace())
print('abc'.isspace())
print('Testing with ZERO WIDTH SPACE unicode character below:')
print('[Wrong] {}'.format('​'.isspace())) # Returns false! That's, I believe, incorrect behavior.
False
True
True
True
True
True
False
Testing with ZERO WIDTH SPACE unicode character below:
[Wrong] False
import strutils except isSpace
import unicode
echo "".isSpaceAscii() # empty string has to be in double quotes
echo ' '.isSpaceAscii()
echo '\t'.isSpaceAscii()
echo '\r'.isSpaceAscii()
echo "\n".isSpaceAscii() # \n is a string, not a character in Nim
echo " \t\n".isSpaceAscii()
echo "abc".isSpaceAscii()
echo "Testing with ZERO WIDTH SPACE unicode character below:"
echo "[Wrong] ", "​".isSpaceAscii() # Returns false! As the name suggests, works only for ascii.
echo "​".isSpace() # from unicode
false
true
true
true
true
true
false
Testing with ZERO WIDTH SPACE unicode character below:
[Wrong] false
true

Notes

  1. Empty string results in a false result for both Python and Nim variants of isspace.
  2. \n is a string, not a character in Nim, because depending on the OS, \n can consist of one or more characters.
  3. isSpace from strutils is deprecated. Use isSpaceAscii instead, or isSpace from unicode (as done above).
  4. To check whether a non-ASCII character is whitespace, use unicode.isSpace.
  5. Interestingly, Nim’s isSpace from unicode module returns true for ZERO WIDTH SPACE unicode character (0x200b) as input, but Python’s isspace returns false. I believe Python’s behavior here is incorrect.
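
For context, the Python documentation defines whitespace for isspace as a character whose Unicode general category is Zs, or whose bidirectional class is WS, B, or S. The sketch below prints those properties for a few characters, which shows where the False for ZERO WIDTH SPACE comes from, whatever one thinks of that classification.

```python
import unicodedata

samples = [
    ("SPACE", " "),
    ("NO-BREAK SPACE", "\u00a0"),
    ("EM SPACE", "\u2003"),
    ("ZERO WIDTH SPACE", "\u200b"),
]

for name, ch in samples:
    # general category, bidirectional class, and the isspace() verdict
    print(name, unicodedata.category(ch), unicodedata.bidirectional(ch), ch.isspace())
```

ZERO WIDTH SPACE is reported with general category Cf (a format character) rather than Zs, so it falls outside Python's definition.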

Is Title?

print(''.istitle())
print('this is not a title'.istitle())
print('This Is A Title'.istitle())
print('This Is À Title'.istitle())
print('This Is Not a Title'.istitle())
False
False
True
True
False
import unicode
echo "".isTitle()
echo "this is not a title".isTitle()
echo "This Is A Title".isTitle()
echo "This Is À Title".isTitle()
echo "This Is Not a Title".isTitle()
false
false
true
true
false

Join

print(' '.join(['a', 'b', 'c']))
print('xx'.join(['a', 'b', 'c']))
a b c
axxbxxc
import strutils
echo "Sequences:"
# echo @["a", "b", "c"].join(' ') # Error: type mismatch: got (seq[string], char)
echo @["a", "b", "c"].join(" ")
echo join(@["a", "b", "c"], " ")
echo "Lists:"
# echo ["a", "b", "c"].join(" ") # Does not work, Error: cannot instantiate: echo["a", "b", "c"]; got 3 type(s) but expected 0
echo (["a", "b", "c"].join(" ")) # Works!
echo join(["a", "b", "c"], " ") # Works!
var list = ["a", "b", "c"]
echo list.join(" ") # Works too!
echo @["a", "b", "c"].join("xx")
Sequences:
a b c
a b c
Lists:
a b c
a b c
a b c
axxbxxc

Notes

  1. The second argument to join, the separator, has to be a string; it cannot be a character.
  2. echo ["a", "b", "c"].join(" ") does not work, but echo (["a", "b", "c"].join(" ")) works – Issue # 6210.

Justify with filling

Center Justify with filling

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.center(80))
print(str.center(80, '*'))
                  a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
******************a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx******************
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.center(80)
echo str.center(80, '*')
# or
echo center(str, 80, '*')
                  a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
******************a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx******************
******************a	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx******************

Left Justify with filling

print('abc'.ljust(2, '*'))
print('abc'.ljust(10, '*'))
abc
abc*******
import strutils
proc ljust(s: string; count: Natural; padding = ' '): string =
  result = s
  let strlen: int = len(s)
  if strlen < count:
    result.add(padding.repeat(count-strlen))

echo "abc".ljust(2, '*')
echo "abc".ljust(10, '*')
abc
abc*******

Notes

  • No Nim equivalent, but I came up with my own ljust proc for Nim above.

Right Justify with filling

print('abc'.rjust(10, '*'))
*******abc
import strutils
echo "abc".align(10, '*')
*******abc

Zero Fill

print('42'.zfill(5))
print('-42'.zfill(5))
print(' -42'.zfill(5))
00042
-0042
0 -42
import strutils
echo "Using align:"
echo "42".align(5, '0')
echo "-42".align(5, '0')

echo "Using zfill:"
proc zfill(s: string; count: Natural): string =
  let strlen: int = len(s)
  if strlen < count:
    if s[0]=='-':
      result = "-"
      result.add("0".repeat(count-strlen))
      result.add(s[1 .. s.high])
    else:
      result = "0".repeat(count-strlen)
      result.add(s)
  else:
    result = s

echo "42".zfill(5)
echo "-42".zfill(5)
echo " -42".zfill(5)
Using align:
00042
00-42
Using zfill:
00042
-0042
0 -42

Notes

  • Nim's align does not treat the sign the way Python's zfill does: when zero-padding a string that represents a negative number, it puts the zeroes in front of the minus sign.
  • No Nim equivalent, but I came up with my own zfill proc for Nim above.
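
To make the sign handling explicit, here is the same logic written out in Python and compared against the built-in zfill. The name zfill_sketch is hypothetical. Unlike the Nim proc above, this sketch also keeps a leading '+' in front of the zeroes, which is what Python's zfill does.

```python
def zfill_sketch(s, width):
    """Sign-aware zero fill, written out step by step for illustration."""
    pad = width - len(s)
    if pad <= 0:
        return s
    # A leading '+' or '-' stays in front of the zeroes.
    if s[:1] in ("+", "-"):
        return s[0] + "0" * pad + s[1:]
    return "0" * pad + s


for sample in ["42", "-42", " -42", "+42", "123456"]:
    print(repr(zfill_sketch(sample, 5)), repr(sample.zfill(5)))
```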

Case conversion

To Lower

print('a'.lower())
print('A'.lower())
print('abc'.lower())
print('Abc'.lower())
print('aBc'.lower())
print('012'.lower())
print('{}'.lower())
print('ABC'.lower())
print('À'.lower())
print('à'.lower())
a
a
abc
abc
abc
012
{}
abc
à
à
import strutils except toLower
import unicode
echo 'a'.toLowerAscii()
echo 'A'.toLowerAscii()
echo "abc".toLowerAscii()
echo "Abc".toLowerAscii()
echo "aBc".toLowerAscii()
echo "012".toLowerAscii()
echo "{}".toLowerAscii()
echo "ABC".toLowerAscii()
echo "[Wrong] ", "À".toLowerAscii() # Does not work! As the name suggests, works only for ascii.
echo "à".toLowerAscii()
echo "À".toLower() # from unicode
echo "à".toLower() # from unicode
a
a
abc
abc
abc
012
{}
abc
[Wrong] À
à
à
à

Notes

  1. toLower from strutils is deprecated. Use toLowerAscii instead, or toLower from unicode (as done above).
  2. To convert a non-ASCII letter to lower case, use unicode.toLower.

To Upper

print('a'.upper())
print('A'.upper())
print('abc'.upper())
print('Abc'.upper())
print('aBc'.upper())
print('012'.upper())
print('{}'.upper())
print('ABC'.upper())
print('À'.upper())
print('à'.upper())
A
A
ABC
ABC
ABC
012
{}
ABC
À
À
import strutils except toUpper
import unicode
echo 'a'.toUpperAscii()
echo 'A'.toUpperAscii()
echo "abc".toUpperAscii()
echo "Abc".toUpperAscii()
echo "aBc".toUpperAscii()
echo "012".toUpperAscii()
echo "{}".toUpperAscii()
echo "ABC".toUpperAscii()
echo "À".toUpperAscii()
echo "[Wrong] ", "à".toUpperAscii() # Does not work! As the name suggests, works only for ascii.
echo "À".toUpper() # from unicode
echo "à".toUpper() # from unicode
A
A
ABC
ABC
ABC
012
{}
ABC
À
[Wrong] à
À
À

Notes

  1. toUpper from strutils is deprecated. Use toUpperAscii instead, or toUpper from unicode (as done above).
  2. To convert a non-ASCII letter to upper case, use unicode.toUpper.

Capitalize

str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
print(str.capitalize())
A	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
import strutils
var str = "a\tbc\tdef\taghij\tcklm\tdanopqrstuv\tadefwxyz\tzyx"
echo str.capitalizeAscii
# or
echo capitalizeAscii(str)
A	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx
A	bc	def	aghij	cklm	danopqrstuv	adefwxyz	zyx

To Title

print('convert this to title á û'.title())
Convert This To Title Á Û
import unicode
echo "convert this to title á û".title()
Convert This To Title Á Û

Swap Case

print('Swap CASE example á û Ê'.swapcase())
print('Swap CASE example á û Ê'.swapcase().swapcase())
sWAP case EXAMPLE Á Û ê
Swap CASE example á û Ê
import unicode
echo "Swap CASE example á û Ê".swapcase()
echo "Swap CASE example á û Ê".swapcase().swapcase()
sWAP case EXAMPLE Á Û ê
Swap CASE example á û Ê

Notes

  • See this SO Q/A to read about a few cases where s.swapcase().swapcase()==s is not true (at least for Python).
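
One such case in Python involves the German sharp s, whose upper-case form is two characters. This is a Python-only illustration; the Nim behavior for this input was not checked.

```python
s = 'Straße'
once = s.swapcase()      # 'ß' becomes 'SS', so the string grows by one character
twice = once.swapcase()  # 'SS' becomes 'ss', not 'ß'

print(once)
print(twice)
print(twice == s)
```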

Strip

Left/leading and right/trailing Strip

print('«' + '   spacious   '.strip() + '»')
print('www.example.com'.strip('cmowz.'))
print('mississippi'.strip('mipz'))
«spacious»
example
ssiss
import strutils
echo "«", "   spacious   ".strip(), "»"
echo "www.example.com".strip(chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(chars={'m', 'i', 'p', 'z'})
«spacious»
example
ssiss

Notes

  • Python strip takes a string as an argument to specify the letters that need to be stripped off the input string. But Nim strip requires a Set of characters.

Left/leading Strip

print('«' + '   spacious   '.lstrip() + '»')
print('www.example.com'.lstrip('cmowz.'))
print('mississippi'.lstrip('mipz'))
«spacious   »
example.com
ssissippi
import strutils
echo "«", "   spacious   ".strip(trailing=false), "»"
echo "www.example.com".strip(trailing=false, chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(trailing=false, chars={'m', 'i', 'p', 'z'})
«spacious   »
example.com
ssissippi

Right/trailing Strip

print('«' + '   spacious   '.rstrip() + '»')
print('www.example.com'.rstrip('cmowz.'))
print('mississippi'.rstrip('mipz'))
«   spacious»
www.example
mississ
import strutils
echo "«", "   spacious   ".strip(leading=false), "»"
echo "www.example.com".strip(leading=false, chars={'c', 'm', 'o', 'w', 'z', '.'})
echo "mississippi".strip(leading=false, chars={'m', 'i', 'p', 'z'})
«   spacious»
www.example
mississ
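
The three strip variants above all treat the argument as a set of characters, not as a prefix or suffix. A rough Python model of that behavior, with leading and trailing flags named after the Nim arguments, could look like the sketch below. strip_chars is a hypothetical helper, and its default character set is an assumption chosen for this example.

```python
def strip_chars(s, chars=" \t\n\r\v\f", leading=True, trailing=True):
    """Strip characters from either end; flags are named after the Nim strip arguments."""
    charset = set(chars)  # order and repeats in `chars` do not matter
    start, end = 0, len(s)
    if leading:
        while start < end and s[start] in charset:
            start += 1
    if trailing:
        while end > start and s[end - 1] in charset:
            end -= 1
    return s[start:end]


print(strip_chars("mississippi", "mipz"))                  # ssiss
print(strip_chars("mississippi", "mipz", trailing=False))  # ssissippi
print(strip_chars("mississippi", "mipz", leading=False))   # mississ
```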

Partition

First occurrence partition

print('ab:ce:ef:ce:ab'.partition(':'))
print('ab:ce:ef:ce:ab'.partition('ce'))
('ab', ':', 'ce:ef:ce:ab')
('ab:', 'ce', ':ef:ce:ab')
import strmisc
echo "ab:ce:ef:ce:ab".partition(":") # The argument is a string, not a character
echo "ab:ce:ef:ce:ab".partition("ce")
(Field0: ab, Field1: :, Field2: ce:ef:ce:ab)
(Field0: ab:, Field1: ce, Field2: :ef:ce:ab)

Right partition or Last occurrence partition

print('ab:ce:ef:ce:ab'.rpartition(':'))
print('ab:ce:ef:ce:ab'.rpartition('ce'))
('ab:ce:ef:ce', ':', 'ab')
('ab:ce:ef:', 'ce', ':ab')
import strmisc
echo "ab:ce:ef:ce:ab".rpartition(":") # The argument is a string, not a character
# or
echo "ab:ce:ef:ce:ab".partition(":", right=true)
echo "ab:ce:ef:ce:ab".rpartition("ce")
# or
echo "ab:ce:ef:ce:ab".partition("ce", right=true)
(Field0: ab:ce:ef:ce, Field1: :, Field2: ab)
(Field0: ab:ce:ef:ce, Field1: :, Field2: ab)
(Field0: ab:ce:ef:, Field1: ce, Field2: :ab)
(Field0: ab:ce:ef:, Field1: ce, Field2: :ab)
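
On the Python side, the result is a plain 3-tuple that can be unpacked directly. The snippet below also shows what Python returns when the separator is missing; the corresponding Nim behavior is not covered here.

```python
head, sep, tail = 'ab:ce:ef:ce:ab'.partition(':')
print(head, sep, tail)

# When the separator is absent, partition keeps the text in the first slot
# and rpartition keeps it in the last slot.
missing = 'abc'.partition(':')
rmissing = 'abc'.rpartition(':')
print(missing)
print(rmissing)
```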

Replace

print('abc abc abc'.replace(' ab', '-xy'))
print('abc abc abc'.replace(' ab', '-xy', 0))
print('abc abc abc'.replace(' ab', '-xy', 1))
print('abc abc abc'.replace(' ab', '-xy', 2))
abc-xyc-xyc
abc abc abc
abc-xyc abc
abc-xyc-xyc
import strutils
echo "abc abc abc".replace(" ab", "-xy")
# echo "abc abc abc".replace(" ab", "-xy", 0) # Invalid, does not expect a count:int argument
# echo "abc abc abc".replace(" ab", "-xy", 1) # Invalid, does not expect a count:int argument
# echo "abc abc abc".replace(" ab", "-xy", 2) # Invalid, does not expect a count:int argument
abc-xyc-xyc

Notes

  • Nim does not allow specifying the number of occurrences to be replaced using a count argument as in the Python version of replace.
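
A count-limited replace can be built from split and join alone, which suggests a possible workaround wherever split accepts a maxsplit argument. The sketch below is in Python and the helper name replace_n is hypothetical; porting the idea to Nim is left untested here.

```python
def replace_n(s, old, new, count=-1):
    """Replace at most `count` occurrences of `old`, using only split and join."""
    # split(old, count) cuts at the first `count` separators; a negative count means all.
    # `old` must be non-empty, since str.split rejects an empty separator.
    return new.join(s.split(old, count))


print(replace_n('abc abc abc', ' ab', '-xy'))
print(replace_n('abc abc abc', ' ab', '-xy', 0))
print(replace_n('abc abc abc', ' ab', '-xy', 1))
print(replace_n('abc abc abc', ' ab', '-xy', 2))
```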

Split

Split (from left)

print('1,2,3'.split(','))
print('1,2,3'.split(',', maxsplit=1))
print(' {}'.format('1,2,3'.split(',', maxsplit=1)[0]))
print(' {}'.format('1,2,3'.split(',', maxsplit=1)[1]))
print('1,2,,3,'.split(','))
print('1::2::3'.split('::'))
print('1::2::3'.split('::', maxsplit=1))
print('1::2::::3::'.split('::'))
['1', '2', '3']
['1', '2,3']
 1
 2,3
['1', '2', '', '3', '']
['1', '2', '3']
['1', '2::3']
['1', '2', '', '3', '']
import strutils
echo "1,2,3".split(',')
echo "1,2,3".split(',', maxsplit=1)
echo " ", "1,2,3".split(',', maxsplit=1)[0]
echo " ", "1,2,3".split(',', maxsplit=1)[1]
echo "1,2,,3,".split(',')
echo "1::2::3".split("::")
echo "1::2::3".split("::", maxsplit=1)
echo "1::2::::3::".split("::")
@[1, 2, 3]
@[1, 2,3]
 1
 2,3
@[1, 2, , 3, ]
@[1, 2, 3]
@[1, 2::3]
@[1, 2, , 3, ]

Split from right

rsplit behaves just like split unless the maxsplit argument is given.

print('1,2,3'.rsplit(','))
print('1,2,3'.rsplit(',', maxsplit=1))
print(' {}'.format('1,2,3'.rsplit(',', maxsplit=1)[0]))
print(' {}'.format('1,2,3'.rsplit(',', maxsplit=1)[1]))
print('1,2,,3,'.rsplit(','))
print('1::2::3'.rsplit('::'))
print('1::2::3'.rsplit('::', maxsplit=1))
print('1::2::::3::'.rsplit('::'))
['1', '2', '3']
['1,2', '3']
 1,2
 3
['1', '2', '', '3', '']
['1', '2', '3']
['1::2', '3']
['1', '2', '', '3', '']
import strutils
echo "1,2,3".rsplit(',')
echo "1,2,3".rsplit(',', maxsplit=1)
echo " ", "1,2,3".rsplit(',', maxsplit=1)[0]
echo " ", "1,2,3".rsplit(',', maxsplit=1)[1]
echo "1,2,,3,".rsplit(',')
echo "1::2::3".rsplit("::")
echo "1::2::3".rsplit("::", maxsplit=1)
echo "1::2::::3::".rsplit("::")
@[1, 2, 3]
@[1,2, 3]
 1,2
 3
@[1, 2, , 3, ]
@[1, 2, 3]
@[1::2, 3]
@[1, 2, , , 3, ]

Split Lines

splits = 'ab c\n\nde fg\rkl\r\n'.splitlines()
print(splits)
for i, split in enumerate(splits):
    print(' {}: {}'.format(i, split))
print('ab c\n\nde fg\rkl\r\n'.splitlines(keepends=True))
['ab c', '', 'de fg', 'kl']
 0: ab c
 1:
 2: de fg
 3: kl
['ab c\n', '\n', 'de fg\r', 'kl\r\n']
import strutils
var splits: seq[string] = "ab c\n\nde fg\rkl\r\n".splitLines()
echo splits
for i, split in splits:
  echo " ", i, ": ", split
@[ab c, , de fg, kl, ]
 0: ab c
 1:
 2: de fg
 3: kl
 4:

Notes

  • The Nim version of splitLines does not have a second argument like keepends in the Python version splitlines.
  • Also, the Nim version returns an extra, empty trailing element when the string ends with a line terminator. Compare the number of splits created in Python vs Nim in the example above.
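
To get the five-element result of the Nim example from Python, one option is to split on the line terminators with a regular expression instead of calling splitlines. This is only a sketch covering \r\n, \r and \n.

```python
import re

text = 'ab c\n\nde fg\rkl\r\n'

python_style = text.splitlines()
nim_style = re.split(r'\r\n|\r|\n', text)  # try \r\n first so it counts as one terminator

print(python_style)
print(nim_style)
```

The only difference between the two lists is the empty string that follows the final terminator.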

Convert

See the encodings module for equivalents of Python decode and encode functions.
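
For reference, this is the Python behavior being referred to. The sketch shows only the Python side and makes no claim about the API of Nim's encodings module.

```python
text = 'ábc'

utf8 = text.encode('utf-8')      # 'á' takes two bytes in UTF-8
latin1 = text.encode('latin-1')  # and one byte in Latin-1

print(utf8)
print(latin1)
print(utf8.decode('utf-8') == text)
```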

Others

There is no equivalent of the Python translate function in Nim as of this writing (<2017-08-09 Wed>).
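
For readers who have not used it, this is what translate does in Python: it maps or deletes individual characters according to a table, typically built with str.maketrans.

```python
# Map 'e' to 'i' and 'l' to 'p'.
table = str.maketrans('el', 'ip')
mapped = 'hello'.translate(table)

# The third argument of maketrans lists characters to delete.
drop_vowels = str.maketrans('', '', 'aeiou')
stripped = 'hello'.translate(drop_vowels)

print(mapped)
print(stripped)
```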
