As the title says: is there a quick way, ideally a notation, to build a list of all characters from 'A' to 'Z' inclusive, the way you can in Turbo Pascal?
In Turbo Pascal it can be written as ['A'..'Z'].
On
I think the most elegant, simple and Pythonic way is to use the string module:
import string
print(string.ascii_uppercase)
# ABCDEFGHIJKLMNOPQRSTUVWXYZ
x = list(string.ascii_uppercase)
print(x)
# ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
EDIT: If you need the alphabet of a language that uses non-ASCII or even non-Latin characters, you can use PyICU (here is a guide to installing it on Windows: https://github.com/cgohlke/pyicu-build). You can then run this script:
# imports
import locale
from icu import Collator, Locale, LocaleData
# official language name
locale_language, encoding = locale.getlocale()
# alphabet lowercase
locale_alphabet: list = list(LocaleData(locale_language).getExemplarSet())
# alphabet uppercase
locale_alphabet_uppercase: list = list(map(str.upper, locale_alphabet))
# output
print(locale_alphabet_uppercase)
The catch is that PyICU returns the characters in code point order rather than in alphabetical order (in my case, for Polish):
# ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'R', 'S', 'T', 'U', 'W', 'Y', 'Z', 'Ó', 'Ą', 'Ć', 'Ę', 'Ł', 'Ń', 'Ś', 'Ź', 'Ż']
To sort the letters according to your language's conventions, you can pass a custom sort key built with Collator from PyICU (https://stackoverflow.com/a/11124645/11485896):
# sorting
collator = Collator.createInstance(Locale(locale_language))
# output
print(sorted(locale_alphabet_uppercase, key = collator.getSortKey))
Output:
# ['A', 'Ą', 'B', 'C', 'Ć', 'D', 'E', 'Ę', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'Ł', 'M', 'N', 'Ń', 'O', 'Ó', 'P', 'R', 'S', 'Ś', 'T', 'U', 'W', 'Y', 'Z', 'Ź', 'Ż']
On
Unfortunately, there's no other way to do this in Python that's as compact and elegant. As @Soren said, for letters, you can use string.ascii_uppercase or string.ascii_lowercase, on which you can do slicing. For instance, print(string.ascii_uppercase[3:7]) prints "DEFG".
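As a rough sketch of that slicing approach, the bounds can also be looked up by letter rather than by position. The helper name letters_between is made up for illustration and is not part of the string module:

```python
import string

def letters_between(first, last, alphabet=string.ascii_uppercase):
    """Return the inclusive run of letters from first to last (hypothetical helper)."""
    start = alphabet.index(first)
    stop = alphabet.index(last) + 1  # +1 makes the upper bound inclusive
    return alphabet[start:stop]

print(string.ascii_uppercase[3:7])  # DEFG
print(letters_between("D", "G"))    # DEFG
```

index() raises ValueError if a letter is not in the alphabet, which is probably what you want for bad input.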
If you want something more generic, more readable, and not limited to Latin letters, you have to write a class to get close to that notation. I wrote this very simple example as a proof of concept (it leaves out many details, such as error handling and open-ended slices). It should work with any sequence that supports index() and slicing, such as a string, list or tuple (though I've only tested it with strings).
class Slicer:
    def __init__(self, content):
        self.content = content
    def __getitem__(self, key: slice):
        if key.step is not None:
            return self.content[self.content.index(key.start) : self.content.index(key.stop)+1 : key.step]
        return self.content[self.content.index(key.start) : self.content.index(key.stop)+1]
    def __repr__(self):
        return f"Slicer({self.content!r})"
import string
letters = Slicer(string.ascii_uppercase)
print(letters)
# Slicer('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
print(letters["A":"H"])
# ABCDEFGH
print(letters["A":"H":2])
# ACEG
If you want to dig further, this relies on Python's slice object, which is what __getitem__ receives when you write letters["A":"H"]: https://docs.python.org/3/glossary.html#term-slice
For numbers, you can just use the range() function: list(range(1, 7)) returns [1, 2, 3, 4, 5, 6] (it also supports steps).
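A short illustration of range() with and without a step (the variable names are only for the example). Note that, unlike the Pascal notation, the upper bound is exclusive:

```python
# range(start, stop) stops before `stop`.
one_to_six = list(range(1, 7))
print(one_to_six)    # [1, 2, 3, 4, 5, 6]

# A third argument sets the step.
odd_numbers = list(range(1, 7, 2))
print(odd_numbers)   # [1, 3, 5]

# A negative step counts down.
countdown = list(range(10, 0, -3))
print(countdown)     # [10, 7, 4, 1]
```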
On
If you have icu4c and PyICU available, it is possible to construct a list of characters using Unicode sets:
from icu import UnicodeSet
chars = list(UnicodeSet('[A-Z]'))
print(chars)
# ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
Unicode sets also let you build much more elaborate sets. For instance, all uppercase Latin-script letters:
upper_latin = list(UnicodeSet(r'[[\p{Lu}] & [\p{Script=Latn}]]'))
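If PyICU is not available, a rough standard-library approximation is possible with unicodedata. This is only a sketch under two assumptions of mine: it scans just the code points below U+0250 (through Latin Extended-B), and it treats a character name starting with "LATIN" as a stand-in for the Script=Latn property, which is not exactly the same thing:

```python
import unicodedata

def upper_latin_letters(limit=0x0250):
    """Approximate [\\p{Lu} & \\p{Script=Latn}] for code points below `limit`."""
    result = []
    for code in range(limit):
        char = chr(code)
        is_upper = unicodedata.category(char) == "Lu"
        # Assumption: the character name is used as a proxy for the script.
        is_latin = unicodedata.name(char, "").startswith("LATIN")
        if is_upper and is_latin:
            result.append(char)
    return result

upper_latin = upper_latin_letters()
print(upper_latin[:26])
```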
You can use map on a range of character numbers:
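The snippet for this did not survive in the post, so here is a minimal sketch of what it could look like, using ord() to get the code points and chr() to turn them back into characters:

```python
# ord("Z") + 1 because range() excludes its upper bound.
upper = list(map(chr, range(ord("A"), ord("Z") + 1)))
print("".join(upper))  # ABCDEFGHIJKLMNOPQRSTUVWXYZ
```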
You can combine multiple ranges using unpacking:
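Again as a sketch rather than the original code, several map objects can be unpacked into one list with the * operator (the name alnum is only illustrative):

```python
alnum = [
    *map(chr, range(ord("0"), ord("9") + 1)),
    *map(chr, range(ord("A"), ord("Z") + 1)),
    *map(chr, range(ord("a"), ord("z") + 1)),
]
print(len(alnum))           # 62
print("".join(alnum[:12]))  # 0123456789AB
```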
Alternatively, you could create a shorthand function of your own:
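One possible shape for such a helper, assuming an inclusive upper bound to mirror the Pascal notation. The name char_range and its step parameter are my own choices, not something from the standard library:

```python
def char_range(first, last, step=1):
    """Return the characters from first to last, inclusive (hypothetical helper)."""
    return [chr(code) for code in range(ord(first), ord(last) + 1, step)]
```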
Which you can reuse as needed:
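For example, with a hypothetical char_range helper along those lines (defined here as well so the snippet runs on its own):

```python
def char_range(first, last, step=1):
    # Hypothetical helper: inclusive range of characters.
    return [chr(code) for code in range(ord(first), ord(last) + 1, step)]

print(char_range("A", "F"))     # ['A', 'B', 'C', 'D', 'E', 'F']
print(char_range("a", "k", 2))  # ['a', 'c', 'e', 'g', 'i', 'k']

# Ranges combine with unpacking just like the map() version.
hex_digits = [*char_range("0", "9"), *char_range("a", "f")]
print("".join(hex_digits))      # 0123456789abcdef
```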