Python Programming
First steps
What is Python?
- A snake.
- A British comedy group called Monty Python.
- A programming language. The definition of the language: words, punctuation (operators) and grammar (syntax).
- The compiler/interpreter of the Python programming language. (aka. CPython).
When people say Python in relation to programming, they mean either the Python programming language or the tool that translates text (code) written in the Python programming language into the language a computer can actually understand. On MS Windows this is the python.exe you need to install. On Linux/Mac it is usually called python or python3. The generic name of a tool that translates a programming language for the computer is either compiler or interpreter. We'll talk about this later on.
What is needed to write a program?
- An editor where we can write in a language.
- A compiler or interpreter that can translate our text to the language of the computer.
In order to write and run a program you basically need two things: a text editor in which you can write the program, and a compiler or interpreter that can translate this program for the computer.
The source (code) of Python
Python 2 vs. Python 3
- Python 2.x - old, legacy code at companies, answers on the Internet. Retired on January 1, 2020.
- Python 3.x - the one that you should use. (not fully backward compatible) Available since December 3, 2008.
Python has two major lines: version 2.x and version 3.x. In a nutshell, you should always use Python 3 if possible.
Unfortunately you can still encounter many companies and many projects in companies that are stuck on Python 2. In such cases you probably will have to write in Python 2.
In addition, when you search for solutions on the Internet, in many cases you'll encounter solutions that were written for Python 2. Luckily, in most cases it is almost trivial to convert these small examples to work on Python 3. You just need to be able to recognize that the code was originally written for Python 2 and to make the adjustments.
For this reason, while the majority of these pages cover Python 3, we are going to point out the places where it might be useful to know how Python 2 works.
You are free to skip these parts and come back to them when the need arises.
Installation
- MS Windows
- Linux
- Apple/Mac OSX
We are going to cover how to install Python on all 3 major operating systems.
Installation on Linux
- On Linux you usually have Python 2 installed in /usr/bin/python
- Python 3 in /usr/bin/python3.
- If they are not installed, you can install them with the appropriate yum or apt-get command of your distribution.
- An alternative is to install Anaconda with Python 3.x
$ which python3
$ sudo apt-get install python3
$ sudo yum install python3
Installation on Apple Mac OSX
- On Mac OSX you can have Python 2 installed in /usr/bin/python and Python 3 installed as /usr/bin/python3.
- Homebrew
- An alternative is to install Anaconda with Python 3.x
$ which python3
$ brew install python3
Installation on MS Windows
- Make sure the "Add Python 3.10 to PATH" check-box is checked.
Alternatively, if Python was installed without that checkbox, one can re-run the installation, select "Modify installation" and then check the box on "Add Python to environment variables".
Installation of Anaconda
Anaconda is a package that includes Python and a bunch of other tools. I used to recommend it, but these days I prefer a plain installation of Python from python.org.
- Anaconda with Python 3.x
- Anaconda shell
- Anaconda Jupyter notebook
Editors, IDEs
Basically you can use any text editor to write Python code. The minimum I recommend is to have proper syntax highlighting. IDEs will also provide intellisense, that is, in most cases they will be able to understand what kind of objects you have in your code and will be able to show you the available methods and their parameters. Even better, they provide powerful debuggers.
PyCharm seems to be the most popular IDE. It has a free version called community edition.
Linux
Windows
Mac
- CotEditor
- TextWrangler
- TextMate
- Type "text editor" in your Apple Store (filter to free)
All platforms
- Sublime Text (commercial)
- Light Table
IDEs
Documentation
Program types
- Desktop application (MS Word, MS Excel, calculator, Firefox, Chrome, ...)
- Mobile applications - whatever runs on your phone.
- Embedded applications - software in your car or in your shoelace.
- Web applications - they run on the web server and send you HTML that your browser can show.
- Command Line Applications
- Scripts and programs are the same for our purposes
- ...
Python on the command line
- -V|options
- -c|options
More or less the only thing I do on the command line with python is to check the version number:
python -V
python --version
You can run some Python code without creating a file, but I don't remember ever needing this. If you insist:
python -c "print 42"
python3 -c "print(42)"
Type the following to get the details:
man python
First script - hello world
print("Hello World")
- Create a file called hello.py with the above content.
- Open your terminal or the Anaconda Prompt on MS Windows in the directory (folder)
- Change to the directory where you saved the file.
- Run it by typing python hello.py or python3 hello.py
- The extension is .py - mostly for the editor (but also for modules).
- Parentheses after print() are required in Python 3, but use them even if you are stuck on Python 2.
Examples
- The examples are on GitHub
- You can download them and unzip them or you can clone them using
git clone https://github.com/szabgab/slides.git
Sometimes people get an error:
'slides'... fatal: unable to access 'https://github.com/szabgab/slides.git/':
SSL certificate problem: self signed certificate in certificate chain
The solution is then to do the following (on Windows):
set GIT_SSL_NO_VERIFY=true
git clone https://github.com/szabgab/slides.git
Later, after I update the slides you can also update your local copy of the files by running
cd slides
git pull
Comments
# marks single line comments.
There are no real multi-line comments in Python, but we will see a way to have them anyway.
print("hello")
# Comments for other developers
print("world") # more comments
# print("This is not printed")
Variables
greeting = "Hello World!"
print(greeting)
Exercise: Hello world
Try your environment:
- Make sure you have access to the right version of Python.
- Install Python if needed.
- Check if you have a good editor with syntax highlighting.
- Write a simple script called hello.py that prints Hello Foo Bar! replacing Foo Bar with your own name.
- Add some comments to your code.
- Create a variable, assign some text to it and then print out the content of the variable.
What is programming?
- Use some language to tell the computer what to do.
- Like a cooking recipe it has step-by-step instructions.
- Taking a complex problem and dividing it into small steps a computer can do.
What are the programming languages
- A computer CPU is created from transistors, 1 and 0 values. (aka. bits)
- Its language consists of numbers. (e.g 37 means move the content of ax register to bx register)
- English? too complex, too much ambiguity.
- Programming languages are in-between.
A written human language
- Words
- Punctuation: - . , ! ?
- Grammar
- ...
A programming language
- Built-in words: print, len, type, def, ...
- Literal values: numbers, strings
- Operators: + - * = , ; ...
- Grammar (syntax)
- User-created words: variables, functions, classes, ...
Words and punctuation matter!
- What did you chose? (Correctly: choose, but people will usually understand.)
- Lets do the homework. (Correctly: Let's, but most people will understand.)
- Let's eat, grandpa!
- Let's eat grandpa!
- Programming languages have a lot fewer words, but they are very strict on the grammar (syntax).
- A missing comma can break your code.
- A missing space will change the meaning of your code.
- An incorrect word can ruin your day.
Types matter to Python (a bit)
- Python differentiates between strings, integers, and floating point numbers.
- "2" is not the same as 2
- "3.14" is not the same as 3.14
String vs int
x = 2
y = "2"
print(x)
print(y)
print(x + 1)
print(y + 1)
Output:
2
2
3
Traceback (most recent call last):
File "/home/gabor/work/slides/python/examples/basics/str_int.py", line 9, in <module>
print(y + 1)
TypeError: can only concatenate str (not "int") to str
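One way out of this TypeError, sketched here under the assumption that you know which type you want, is to convert explicitly before using +. The built-in int() and str() functions do the conversion; the variable names as_number and as_text are only for illustration.

```python
y = "2"

# Treat y as a number: convert the string to int, then add
as_number = int(y) + 1
print(as_number)   # 3

# Treat 1 as text: convert the int to str, then concatenate
as_text = y + str(1)
print(as_text)     # 21
```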
String vs float
x = 3.14
y = "3.14"
print(x)
print(y)
print(x + 1.1)
print(y + 1.1)
Output:
3.14
3.14
4.24
Traceback (most recent call last):
File "/home/gabor/work/slides/python/examples/basics/str_float.py", line 9, in <module>
print(y + 1.1)
TypeError: can only concatenate str (not "float") to str
int and float
x = 2
y = 3.14
print(x + 1.5)
print(y + 1)
Output:
3.5
4.140000000000001
Literals, Value Types in Python
- int
- str
- float
- bool
print( type(23) ) # int
print( type(3.14) ) # float
print( type("hello") ) # str
print( type("23") ) # str
print( type("3.24") ) # str
print( type(None) ) # NoneType
print( type(True) ) # bool
print( type(False) ) # bool
print( type([]) ) # list
print( type({}) ) # dict
print( type(hello) ) # NameError: name 'hello' is not defined
print("Still running")
Output:
Traceback (most recent call last):
File "python/examples/basics/types.py", line 15, in <module>
print( type(hello) ) # str
NameError: name 'hello' is not defined
- Strings must be enclosed in quotes.
- Numbers must NOT be enclosed in quotes.
Floating point limitation
print(0.1 + 0.2) # 0.30000000000000004
x = 0.1 + 0.2
y = 0.3
print(x) # 0.30000000000000004
print(y) # 0.3
if x == y:
    print("They are equal")
else:
    print("They are NOT equal")
Floating point - compare using round
- round
x = 0.1 + 0.2
y = 0.3
print(x) # 0.30000000000000004
print(y) # 0.3
print(round(x, 10))
if round(x, 10) == round(y, 10):
    print("They are equal")
else:
    print("They are NOT equal")
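As an alternative to rounding both sides, the standard math module has an isclose function that checks whether two floats are within a tolerance of each other. This is only a sketch of that option, not part of the original example; the rel_tol value shown is the documented default.

```python
import math

x = 0.1 + 0.2
y = 0.3

print(x == y)               # False
print(math.isclose(x, y))   # True

# rel_tol is the allowed relative difference between the two values
print(math.isclose(x, y, rel_tol=1e-9))  # True
```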
round
- round
pi = 3.141592653589793
print(pi) # 3.141592653589793
print(round(pi, 10)) # 3.1415926536
print(round(pi, 5)) # 3.14159
print(round(pi, 2)) # 3.14
Value Types in Numpy
Numpy but also other programming languages might have them.
- int8
- int32
- float32
- float64
- ...
Rectangle (numerical operations)
- =
In this example we create two variables, width and height, containing the numbers 23 and 17 respectively.
Unlike in math, in programming a single equal sign = generally means assignment: we want the value on the right-hand side to be in the variable on the left-hand side.
Others might say: make the word/name on the left-hand side of the = sign refer to the value that is on the right-hand side.
In any case this is not a mathematical statement of truth, not an equation, but a statement of an action.
On the next line we multiply the values in two already existing variables and assign the result to a third variable called area.
At the end we use the print function, which we have already seen, to print the result on the screen.
A simple mathematical operation.
width = 23
height = 17
area = width * height
print(area) # 391
Multiply string
What if we put the two numbers into quotation marks and thus make them strings? Strings that look like numbers to the naked eye, but nevertheless are strings for Python.
If we try to multiply them we get a nasty exception, also known as a runtime error. The program stops running.
These exceptions might look nasty, but they are our friends. They tell us what went wrong and exactly where it happened.
You just need to remember that, at least in Python, you need to read the whole thing from the bottom to the top. The last line holds the error message. Above that you can usually see the content of the line where the problem was found. One line above that you'll see the name of the file and the line number where the problem occurred.
I strongly urge you to read the error message. If it is not yet clear what the problem is, then copy it to your favorite search engine and read the explanations you find.
Eventually you'll learn to recognize these messages much faster and it will be much easier to fix the problems.
What this current error message means is we tried to multiply two strings and Python cannot do that.
width = "23"
height = "17"
area = width * height
print(area)
Output:
Traceback (most recent call last):
File "python/examples/basics/rectangular_strings.py", line 3, in <module>
area = width * height
TypeError: can't multiply sequence by non-int of type 'str'
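The wording of this error hints that a string can be multiplied by an int, just not by another string. The following sketch shows both that behavior (repetition) and the numeric result obtained by converting the strings with int() first.

```python
width = "23"
height = "17"

# A string multiplied by an int is repeated that many times
print(width * 2)                  # 2323

# Converting both strings to int gives the numeric product
area = int(width) * int(height)
print(area)                       # 391
```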
Add numbers
OK, so we know how to multiply two numbers. Let's now take a giant leap and try to add two numbers together.
It works as expected. We can move on to the next challenge.
a = 19
b = 23
c = a + b
print(c) # 42
Add strings
- concatenation
You guessed right, we now wrap the numbers in quotes and try to add them together.
Surprisingly it works, though the result is a bit strange at first: as if Python put one string after the other.
Indeed the + operator is defined when we have two strings on the two sides. It is then called concatenation.
In general you'll have to learn what the mathematical operators do when they are applied to values other than numbers. Usually the operation they do is quite logical. You just need to find the right logic.
a = "19"
b = "23"
c = a + b
print(c) # 1923
d = b + a
print(d) # 2319
Exercise: Calculations
- Extend the examples/basics/rectangle_basic.py file from the earlier example to print both the area and the circumference of the rectangle.
- Write a script called basic_circle.py that has a variable holding the radius of a circle and prints out the area of the circle and the circumference of the circle.
- Write a script called basic_calc.py that has two numbers a and b and prints out the results of a+b, a-b, a*b, a/b
Solution: Calculations
- math
- pi
In order to have the math operation work properly we had to put the addition in parentheses. Just as you would in math class.
width = 23
height = 17
area = width * height
print("The area is ", area) # 391
circumference = 2 * (width + height)
print("The circumference is ", circumference) # 80
In order to calculate the area and the circumference of a circle we need to have PI, so we created a variable called pi and put in 3.14, which is a very rough estimation. You might want to have a more exact value of PI.
r = 7
pi = 3.14
print("The area is ", r * r * pi) # 153.86
print("The circumference is ", 2 * r * pi) # 43.96
Python has lots of modules (aka. libraries, aka. extensions), extra code that you can import and start using.
For example it has a module called math that provides all kinds of math-related functions and attributes.
A function does something, an attribute just holds some value. More about this later.
Specifically it has an attribute called math.pi with the value 3.141592653589793, a much better approximation of PI.
In the following solution we used that.
- The documentation of the math module.
import math
r = 7
print("The area is ", r * r * math.pi) # 153.9380400258998
print("The circumference is ", 2 * r * math.pi) # 43.982297150257104
The expression r * r might also have bothered your eyes. Don't worry, in Python there is an operator to express exponentiation. It is the double star: **. This is how we can use it to say r squared: r ** 2.
r = 7
pi = 3.14
print("The area is ", r ** 2 * pi) # 153.86
print("The circumference is ", 2 * r * pi) # 43.96
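A few more uses of the ** operator, as a small sketch. The values are chosen only for illustration; the last part combines ** with math.pi from the earlier solution.

```python
import math

print(2 ** 10)    # 1024
print(7 ** 2)     # 49
print(9 ** 0.5)   # 3.0  (a fractional exponent gives a root)

# ** binds tighter than *, so this is (r ** 2) * math.pi
r = 7
area = r ** 2 * math.pi
print(area)
```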
I don't have much to say about the calculator. I think it is quite straightforward.
a = 3
b = 2
print(a+b) # 5
print(a-b) # 1
print(a*b) # 6
print(a/b) # 1.5
Second steps
Modules
When we program in Python we basically have 3 main pieces: the base language itself, a set of standard modules, and a set of 3rd party modules.
All the modules provide additional functionality to the base language, and without them we would not be able to do much. The standard modules come installed with Python; the 3rd party modules we need to install. Once installed, however, they behave in the same way: we need to import them and then we can use them. We'll discuss these even more later, but we would already like to use some, so let's see some basic ideas.
I know we already used the math module in the solution of the earlier exercises, but some people might have missed those.
In this example we import the sys module that contains various attributes and operations related to the Python system. (There is another module called os that provides functionality related to the Operating System.)
A few examples:
The executable attribute points to where the currently running Python executable is located. On MS Windows this will be a path to a python.exe file.
platform is going to be win32 on any Windows machine.
We are going to discuss the whole sys.argv thing a lot more, but for now note that sys.argv[0] contains the path to the current Python file.
sys.version_info contains the version information about the currently running Python.
Specifically sys.version_info.major contains the major version number, which is 3 for Python 3 and 2 for Python 2.
If really needed, you could use this to recognize when someone is trying to run your program on an unsupported version of Python.
These were all attributes that contain some fixed value.
There is also the getsizeof function that comes with the sys module. You know it is a function because you see a pair of parentheses at the end. The attributes above did not have parentheses. Functions do something. This specific function calculates the number of bytes being used by an object.
You can see an integer (both 1 and 42) uses 28 bytes.
A floating point number uses 24 bytes.
An empty string uses 49 bytes.
Then each character takes another byte. (Actually this is only true in the case of Latin letters, but let's not get ahead of ourselves.)
- The documentation of the sys module.
import sys
print( sys.executable ) # /home/gabor/venv3/bin/python
print( sys.platform ) # linux
print( sys.argv[0] ) # examples/basics/modules.py
print( sys.version_info.major ) # 3
print( sys.getsizeof( 1 ) ) # 28
print( sys.getsizeof( 42 ) ) # 28
print( sys.getsizeof( 1.0 ) ) # 24
print( sys.getsizeof( "" ) ) # 49
print( sys.getsizeof( "a" ) ) # 50
print( sys.getsizeof( "ab" ) ) # 51
print( sys.getsizeof( "abcdefghij" ) ) # 59
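The text above mentions recognizing an unsupported Python version. A minimal sketch of such a check could look like this; the minimum version (3, 6) and the helper name is_supported are assumptions made up for the illustration, not something the examples require.

```python
import sys

# Hypothetical minimum version, chosen only for this illustration
MINIMUM_VERSION = (3, 6)

def is_supported(version_info, minimum=MINIMUM_VERSION):
    """Return True if the (major, minor) part of version_info is at least minimum."""
    return tuple(version_info[:2]) >= minimum

if not is_supported(sys.version_info):
    sys.exit("This program needs Python 3.6 or newer")

print("Running on Python", sys.version_info.major)
```

Comparing tuples works element by element, so (3, 10) counts as newer than (3, 6) even though 10 would sort before 6 as text.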
A main function
- main
- def
You could write your code in the main body of your Python file, but using functions and passing arguments to them will make your code easier to maintain and understand. Therefore I recommend that you always write every script with a function called "main".
- Function definition starts with the def keyword, followed by the name of the new function ("main" in this case), followed by the list of parameters in parentheses (nothing in this case).
- The content or body of the function is then indented to the right.
- The function definition ends when the indentation stops.
If you execute this file you might be surprised that nothing happens. This is so because we only defined the function, we never used it. We'll do that next.
def main():
    print("Hello")
    print("World")
This won't run as the main function is declared, but it is never called (invoked).
The main function - called
- main
- def
In this example I added 3 lines to the previous file. The line main() calls the main function. We sometimes also say "runs the function" or "invokes the function". In this context they mean the same.
The two print statements are not necessary to call the function; I only added them so that it is easy to see the order of the operations by looking at the output.
def main():
    print("Hello")
    print("World")
print("before")
main()
print("after")
before
Hello
World
after
- Use a main function to avoid globals and better structure your code.
- Python uses indentation for blocks instead of curly braces, and it uses the colon : to start a block.
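The main function above takes no parameters. As a sketch of what passing arguments to a function looks like, here is a hypothetical helper called greet that receives a value and hands back a result with return; main then calls it.

```python
def greet(name):
    # name is a parameter: it receives the value passed in by the caller
    return "Hello " + name

def main():
    message = greet("World")
    print(message)   # Hello World

main()
```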
Indentation
- indentation
- Standard recommendations: 4 spaces on every level.
Conditional main
- main
- name
def main():
    print("Hello World")
if __name__ == "__main__":
    main()
- We'll cover this later but in case you'd like, you can include this conditional execution of the main function.
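For the curious, a short sketch of why the condition works: when a file is run directly, Python sets the special variable __name__ to "__main__"; when the same file is imported from another file, __name__ holds the module's name instead. The area function here is only an illustrative stand-in for real code.

```python
def area(width, height):
    return width * height

def main():
    print("Running as a script:", area(23, 17))

print("__name__ is", __name__)

# main() is called only when the file is executed directly,
# not when it is imported by another file.
if __name__ == "__main__":
    main()
```

With this layout another file can import area without triggering the print inside main.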
Input - Output I/O
Input
- Keyboard (Standard Input, Command line, GUI)
- Mouse (Touch pad)
- Touch screen
- Files, Filesystem
- Network (e.g. in Web applications)
Output
- Screen
- File
- Network
print in Python 2
print is one of the keywords that changed between Python 2 and Python 3. In Python 2 it does not need parentheses, in Python 3 it is a function and it needs to have parentheses.
print "hello"
print "world"
print "Foo", "Bar"
hello
world
Foo Bar
print "hello",
print "world"
print "Foo", "Bar"
hello world
Foo Bar
No newline, but a space is added at the end of the output and between values.
import sys
sys.stdout.write("hello")
sys.stdout.write("world")
helloworld
write takes exactly one parameter
print in Python 3
- end
- sep
print("hello")
print("world")
print("Foo", "Bar")
hello
world
Foo Bar
print("hello", end=" ")
print("world")
print("Foo", "Bar")
print("hello", end="")
print("world")
print("hello", end="-")
print("world")
hello world
Foo Bar
helloworld
hello-world
end will set the character added at the end of each print statement.
print("hello", end="")
print("world")
print("Foo", "Bar", sep="")
print("END")
helloworld
FooBar
END
sep will set the character separating values.
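The sep and end parameters can also be combined in one call. The sketch below additionally uses the file parameter of print together with io.StringIO to collect the output in memory, which is only done here so the exact characters can be inspected.

```python
import io

buffer = io.StringIO()

# sep goes between the values, end goes after the last one
print("2024", "01", "31", sep="-", end="!\n", file=buffer)
print("a", "b", sep="", file=buffer)

output = buffer.getvalue()
print(output, end="")
# 2024-01-31!
# ab
```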
print in Python 2 as if it was Python 3
- future
- print_function
from __future__ import print_function
print("hello", end="")
print("world")
helloworld
Exception: SyntaxError: Missing parentheses in call
What if we run some code with print "hello" using Python 3?
File "examples/basics/print.py", line 1
print "hello"
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("hello")?
Prompting for user input in Python 2
- raw_input
- prompt
- STDIN
from __future__ import print_function
def main():
    print("We have a question!")
    name = raw_input('Your name: ')
    print('Hello', name, ', how are you?')
    print('Hello ' + name + ', how are you?')
main()
/usr/bin/python2 prompt2.py
We have a question!
Your name: Foo Bar
Hello Foo Bar , how are you?
Hello Foo Bar, how are you?
What happens if you run this with Python 3 ?
/usr/bin/python3 prompt2.py
We have a question!
Traceback (most recent call last):
File "prompt2.py", line 7, in <module>
main()
File "prompt2.py", line 4, in main
name = raw_input('Your name: ')
NameError: name 'raw_input' is not defined
Prompting for user input in Python 3
- input
- prompt
- STDIN
In Python 3 the raw_input() function was replaced by the input() function.
def main():
    print("We have a question!")
    name = input('Your name: ')
    print('Hello ' + name + ', how are you?')
main()
What happens if you run this using Python 2 ?
/usr/bin/python2 prompt3.py
- What happens if we type in "Foo Bar"
We have a question!
Your name: Foo Bar
Your name: Traceback (most recent call last):
File "prompt3.py", line 5, in <module>
main()
File "prompt3.py", line 2, in main
name = input('Your name: ')
File "<string>", line 1
Foo Bar
^
SyntaxError: unexpected EOF while parsing
- What happens if we type in just "Foo" - no spaces:
We have a question!
Your name: Foo
Your name: Traceback (most recent call last):
File "prompt3.py", line 5, in <module>
main()
File "prompt3.py", line 2, in main
name = input('Your name: ')
File "<string>", line 1, in <module>
NameError: name 'Foo' is not defined
- The next example shows a way to exploit the input function in Python 2 to delete the currently running script. You know, just for fun.
We have a question!
Your name: __import__("os").unlink(__file__) or "Hudini"
Hello Hudini, how are you?
Python2 input or raw_input?
In Python 2 always use raw_input() and never input().
Prompting both Python 2 and Python 3
- raw_input
- input
from __future__ import print_function
import sys
def main():
    if sys.version_info.major < 3:
        name = raw_input('Your name: ')
    else:
        name = input('Your name: ')
    print('Hello ' + name + ', how are you?')
main()
Add numbers entered by the user (oops)
def main():
    a = input('First number: ')
    b = input('Second number: ')
    print(a + b)
main()
First number: 2
Second number: 3
23
When reading from the command line using input(), the resulting value is a string, even if you only typed in digits. Therefore the addition operator + concatenates the strings.
Add numbers entered by the user (fixed)
def main():
    a = input("First number: ")
    b = input("Second number: ")
    print(int(a) + int(b))
    print(a + b)
main()
First number: 2
Second number: 3
5
In order to convert the strings to numbers use the int() or the float() functions, whichever is appropriate in your situation.
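To make the difference between the two conversions concrete, here is a sketch that uses fixed strings in place of user input. The helper name add_numbers is made up for this illustration.

```python
def add_numbers(first, second):
    """Convert two strings to float and return their sum."""
    return float(first) + float(second)

print(add_numbers("2", "3"))      # 5.0
print(add_numbers("2.5", "3"))    # 5.5

# int() keeps the result an integer, but rejects strings like "2.5"
print(int("2") + int("3"))        # 5
```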
Can we convert a string to int or float?
- isdigit
- int
- float
for var in ["23", "2.3", "a", "2.3.4", "2x"]:
    print(var)
    if var.isdigit():
        print(f"{var} can be converted to int:", int(var))
    if var.replace(".", "", 1).isdigit():
        print(f"{var} can be converted to float:", float(var))
    print('-----')
23
23 can be converted to int: 23
23 can be converted to float: 23.0
-----
2.3
2.3 can be converted to float: 2.3
-----
a
-----
2.3.4
-----
2x
-----
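The isdigit check is stricter than int() itself. A short sketch of strings that isdigit rejects although int() converts them without complaint:

```python
for text in ["-5", " 42 ", "+7"]:
    # isdigit() is True only if every character is a digit,
    # so a sign or surrounding spaces make it False
    print(repr(text), text.isdigit(), int(text))
```

So a check based on isdigit alone would turn away negative numbers and input with stray spaces.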
How can I check if a string can be converted to a number?
- isdecimal
- isnumeric
- This solution only works for integers. Not for floating point numbers.
- We'll talk about this later. For now assume that the user enters something that can be converted to a number.
- Wrap the code in a try-except block to catch any exception raised during the conversion.
- Use Regular Expressions (regexes) to verify that the input string looks like a number.
- isdecimal Decimal numbers (digits) (not floating point)
- isnumeric Numeric character in the Unicode set (but not floating point number)
- In your spare time you might want to check out the standard types of Python at stdtypes.
val = input("Type in a number: ")
print(val)
print(val.isdecimal())
print(val.isnumeric())
if val.isdecimal():
    num = int(val)
    print(num)
Type in a number: 42
42
True
True
42
Type in a number: 4.2
4.2
False
False
val = '11'
print(val.isdecimal()) # True
print(val.isnumeric()) # True
val = '1.1'
print(val.isdecimal()) # False
print(val.isnumeric()) # False
val = '٣' # arabic 3
print(val.isdecimal()) # True
print(val.isnumeric()) # True
print(val)
print(int(val)) # 3
val = '½' # unicode 1/2
print(val.isdecimal()) # False
print(val.isnumeric()) # True
# print(float(val)) # ValueError: could not convert string to float: '½'
val = '②' # unicode circled 2
print(val.isdecimal()) # False
print(val.isnumeric()) # True
# print(int(val)) # ValueError: invalid literal for int() with base 10: '②'
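The list above mentions regular expressions as another way to check the input. A minimal sketch of that idea follows; the pattern and the helper name looks_like_number are assumptions for this illustration, and the pattern deliberately accepts only ASCII digits with an optional sign and an optional fractional part.

```python
import re

# Optional sign, one or more ASCII digits, optional fractional part
NUMBER_PATTERN = re.compile(r"[+-]?[0-9]+(\.[0-9]+)?")

def looks_like_number(text):
    # fullmatch succeeds only if the whole string fits the pattern
    return NUMBER_PATTERN.fullmatch(text) is not None

for text in ["42", "-3", "4.2", "2.3.4", "2x", ""]:
    print(repr(text), looks_like_number(text))
```

Anything the pattern accepts can be passed to float(); it does not try to cover every form float() understands, such as exponents.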
Converting string to int
- int
a = "23"
print(a) # 23
print( type(a) ) # <class 'str'>
b = int(a)
print(b) # 23
print( type(b) ) # <class 'int'>
a = "42 for life"
print(a) # 42 for life
print( type(a) ) # <class 'str'>
b = int(a)
print(b)
print( type(b) )
# Traceback (most recent call last):
# File "converting_string_to_int.py", line 5, in <module>
# b = int(a)
# ValueError: invalid literal for int() with base 10: '42 for life'
Converting float to int
a = 2.1
print( type(a) ) # <class 'float'>
print(a) # 2.1
b = int(2.1)
print( type(b) ) # <class 'int'>
print(b) # 2
a = "2.1"
print(a) # 2.1
print( type(a) ) # <class 'str'>
b = int(a)
print(b)
print( type(b) )
# Traceback (most recent call last):
# File "converting_floating_string_to_int.py", line 5, in <module>
# b = int(a)
# ValueError: invalid literal for int() with base 10: '2.1'
a = "2.1"
b = float(a)
c = int(b)
print(c) # 2
print( type(a) ) # <class 'str'>
print( type(b) ) # <class 'float'>
print( type(c) ) # <class 'int'>
d = int( float(a) )
print(d) # 2
print( type(d) ) # <class 'int'>
print( int( float(2.1) )) # 2
print( int( float("2") )) # 2
print( int( float(2) )) # 2
How can I check if a string can be converted to a number?
- int
- float
- is_int
- is_float
There is no is_int, we just need to try to convert and catch the exception, if there is one.
def is_float(val):
    try:
        num = float(val)
    except ValueError:
        return False
    return True
def is_int(val):
    try:
        num = int(val)
    except ValueError:
        return False
    return True
print( is_float("23") ) # True
print( is_float("23.2") ) # True
print( is_float("23x") ) # False
print( '-----' ) # -----
print( is_int("23") ) # True
print( is_int("23.2") ) # False
print( is_int("23x") ) # False
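The same try-except idea can return the converted value instead of True or False. This is a sketch with a made-up helper name, to_number, which tries int first and falls back to float.

```python
def to_number(val):
    """Return val as int if possible, else as float, else None."""
    try:
        return int(val)
    except ValueError:
        pass
    try:
        return float(val)
    except ValueError:
        return None

print(to_number("23"))     # 23
print(to_number("23.2"))   # 23.2
print(to_number("23x"))    # None
```

Keep in mind that float() also accepts strings such as "inf" and "nan", so you may want an extra check if those should be rejected.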
Conditionals: if
- if
def main():
    expected_answer = "42"
    inp = input('What is the answer? ')
    if inp == expected_answer:
        print("Welcome to the cabal!")
        print("Still here")
    print("This always happens")
main()
Conditionals: if - else
- if
- else
def main():
    expected_answer = "42"
    inp = input('What is the answer? ')
    if inp == expected_answer:
        print("Welcome to the cabal!")
    else:
        print("Read the Hitchhiker's guide to the galaxy!")
    print("This always happens")
main()
Divide by 0
- ZeroDivisionError
- Another use-case for if and else:
def main():
    a = input('First number: ')
    b = input('Second number: ')
    print("Dividing", a, "by", b)
    print(int(a) / int(b))
    print("Still running")
main()
First number: 3
Second number: 0
Dividing 3 by 0
Traceback (most recent call last):
File "examples/basics/divide_by_zero.py", line 9, in <module>
main()
File "examples/basics/divide_by_zero.py", line 7, in main
print(int(a) / int(b))
ZeroDivisionError: division by zero
Conditionals: if - else (other example)
- if
- else
def main():
    a = input('First number: ')
    b = input('Second number: ')
    if int(b) == 0:
        print("Cannot divide by 0")
    else:
        print("Dividing", a, "by", b)
        print(int(a) / int(b))
    print("Still running")
main()
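Checking for 0 before dividing is one option. Another, sketched here with a hypothetical helper called safe_divide, is to attempt the division and catch the ZeroDivisionError with try-except, the same construct used earlier for checking conversions.

```python
def safe_divide(a, b):
    """Return a / b, or None when b is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        return None

print(safe_divide(3, 2))   # 1.5
print(safe_divide(3, 0))   # None
```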
Conditionals: else if
- else if
def main():
    a = input('First number: ')
    b = input('Second number: ')
    if int(a) == int(b):
        print('They are equal')
    else:
        if int(a) < int(b):
            print(a + ' is smaller than ' + b)
        else:
            print(a + ' is bigger than ' + b)
main()
Conditionals: elif
- elif
- else if
def main():
    a = input("First number: ")
    b = input("Second number: ")
    if int(a) == int(b):
        print("They are equal")
    elif int(a) < int(b):
        print(f"{a} is smaller than {b}")
    else:
        print(f"{a} is bigger than {b}")
main()
Ternary operator (Conditional Operator)
- ?:
x = 3
answer = 'positive' if x > 0 else 'negative or zero'
print(answer) # positive
x = -3
answer = 'positive' if x > 0 else 'negative or zero'
print(answer) # negative or zero
x = 3
if x > 0:
    answer = "positive"
else:
    answer = "negative or zero"
print(answer) # positive
x = -3
if x > 0:
    answer = "positive"
else:
    answer = "negative or zero"
print(answer) # negative or zero
In other languages this is the ?: construct.
Case or Switch in Python: match pattern matching
- case
- switch
- match
import sys
if len(sys.argv) != 2:
    print("Usage: python switch.py <status_code>")
    sys.exit(1)
status_code = int(sys.argv[1])
match status_code:
    case 100:
        print("100")
    case 200:
        print("200")
    case 200:
        # never reached: the first matching case above wins
        print("200 again")
    case 401 | 302:
        print("401 or 302")
    case _:
        print("other")
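The match statement exists only from Python 3.10 onward. On older versions roughly the same dispatch can be sketched with if, elif and else, as below. The function name describe_status is made up for this illustration, and it returns the text instead of printing so the result is easy to check.

```python
def describe_status(status_code):
    if status_code == 100:
        return "100"
    elif status_code == 200:
        return "200"
    elif status_code in (401, 302):
        # plays the role of "case 401 | 302"
        return "401 or 302"
    else:
        # plays the role of "case _"
        return "other"

for code in [100, 200, 302, 401, 500]:
    print(code, describe_status(code))
```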
Exercise: Rectangle
- Write a script called basic2_rectangle_input.py that will ask for the sides of a rectangle and print out the area.
- Provide error messages if either of the sides is negative.
python rect.py
Side: 3
Side: 4
The area is 12
Exercise: Calculator
Create a script called basic2_calculator_input.py that accepts 2 numbers and an operator (+, -, *, /), and prints the result of the operation.
python calc.py
Operand: 19
Operand: 23
Operator: +
Results: 42
Exercise: Age limit
- Create a script called basic2_age_limit_input.py
- Ask the user what their age is.
- If it is above 18, tell them they can legally drink alcohol.
- If it is above 21, tell them they can also legally drink in the USA.
- Extra:
- Create a separate file basic2_age_limit_all_input.py
- Ask the user for an age and a country name and tell them if they can legally drink alcohol.
- See the Legal drinking age list.
- Don't worry if this seems to be too difficult now to solve it in a nice way.
Exercise: What is this language?
- Create a script called basic2_language.py
- Ask the user the name of this programming language.
- If they type in Python, welcome them.
- If they type in something else, correct them.
Exercise: Standard Input
-
In the previous exercises we expected the user-input to come in on the "Standard Input" aka. STDIN.
-
If you would like to practice this more, come up with other ideas, try to solve them and tell me about the task. (in person or via e-mail.)
-
(e.g. you could start building an interactive role-playing game.)
-
Name the file basic2_stdin.py
Solution: Area of rectangle
def main():
length = int(input('Length: '))
width = int(input('Width: '))
if length <= 0:
print("length is not positive")
return
if width <= 0:
print("width is not positive")
return
area = length * width
print("The area is ", area)
main()
- For historical reasons we also have the solution in Python 2
from __future__ import print_function
def main():
length = int(raw_input('Length: '))
width = int(raw_input('Width: '))
if length <= 0:
print("length is not positive")
return
if width <= 0:
print("width is not positive")
return
area = length * width
print("The area is ", area)
main()
Solution: Calculator
Here I used an f-string (and the format method of strings in the Python 2 version) to insert the value of op in the {} placeholder. We'll learn about these later on.
def main():
a = float(input("Number: "))
b = float(input("Number: "))
op = input("Operator (+-*/): ")
if op == '+':
res = a+b
elif op == '-':
res = a-b
elif op == '*':
res = a*b
elif op == '/':
res = a/b
else:
print(f"Invalid operator: '{op}'")
return
print(res)
return
main()
- For historical reasons we also have the solution in Python 2
from __future__ import print_function
a = float(raw_input("Number: "))
b = float(raw_input("Number: "))
op = raw_input("Operator (+-*/): ")
if op == '+':
res = a+b
elif op == '-':
res = a-b
elif op == '*':
res = a*b
elif op == '/':
res = a/b
else:
print("Invalid operator: '{}'".format(op))
exit()
print(res)
Solution: Calculator eval
import os
def main():
a = input("Number: ")
b = input("Number: ")
op = input("Operator (+-*/): ")
command = a + op + b
print(command)
res = eval(command)
print(res)
main()
$ python examples/basics/calculator_eval.py
Number: 2
Number: 3
Operator (+-*/): +
2+3
5
Try Again, this time:
$ python examples/basics/calculator_eval.py
Number: os.system("ls -l")
Number:
Operator (+-*/):
Someone could just as well type a command with rm -rf / in it, so do not try that one.
If you are on Windows, try os.system("dir").
A destructive variant would be os.system("rm -f calculator_eval.py"), which on Windows is os.system("del calculator_eval.py").
- Now forget this and don't use eval for the next few years!
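If you like the idea of looking up the operation instead of writing a chain of if/elif, one possible approach without eval is to map each operator string to a function from the standard operator module. This is only a sketch: the names OPERATIONS and calculate are made up for the example, and dictionaries are covered later in the course.

```python
import operator

# Only these four operators are accepted; anything else is rejected.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def calculate(a, op, b):
    if op not in OPERATIONS:
        raise ValueError(f"Invalid operator: '{op}'")
    # float() refuses anything that is not a number, so no code is ever run.
    return OPERATIONS[op](float(a), float(b))


print(calculate("2", "+", "3"))  # 5.0
```

Typing os.system("ls -l") as one of the numbers now only results in a ValueError raised by float.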
Solution: Age limit
age = float(input('Please type in your age: '))
if 21 <= age:
print('You can already drink alcohol. In the USA as well.')
elif 18 <= age:
print('You can already drink alcohol. (But not in the USA.)')
else:
print('You cannot legally drink alcohol.')
Solution: What is this language?
language = input('What is the name of this programming language? ')
if language == 'Python':
print('Welcome!')
else:
print(f'No. It is not "{language}", it is Python.')
STDIN vs Command line arguments
If we run this script without any command-line parameters it will print out usage information.
If we give it two parameters it will treat the first one as the name of an input file and the second as the name of an output file.
- First try this; then repeat. We must type in the same path again and again. Boring and error-prone.
input_file = input("Input file: ")
output_file = input("Output file: ")
print(f"This code will read {input_file}, analyze it and then create {output_file}")
...
- We could use a Tk-based dialog:
- Still boring (though maybe less error-prone)
from tkinter import filedialog
# On recent versions of Ubuntu you might need to install python3-tk in addition to python3 using
# sudo apt-get install python3-tk
input_file = filedialog.askopenfilename(filetypes=(("Excel files", "*.xlsx"), ("CSV files", "*.csv"), ("Any file", "*")))
output_file = filedialog.asksaveasfilename(filetypes=(("Excel files", "*.xlsx"), ("CSV files", "*.csv"), ("Any file", "*")))
print(f"This code will read {input_file}, analyze it and then create {output_file}")
...
- The command line has
- History!
import sys
if len(sys.argv) != 3:
exit(f"Usage: {sys.argv[0]} INPUT_FILE OUTPUT_FILE")
input_file = sys.argv[1]
output_file = sys.argv[2]
print(f"This code will read {input_file}, analyze it and then create {output_file}")
...
Command line arguments
- sys
- argv
import sys
def main():
print(sys.argv)
print(sys.argv[0])
print(sys.argv[1])
print(sys.argv[2])
main()
$ python examples/basics/cli.py one two
['examples/basics/cli.py', 'one', 'two']
examples/basics/cli.py
one
two
$ python examples/basics/cli.py
['examples/basics/cli.py']
examples/basics/cli.py
Traceback (most recent call last):
File "examples/basics/cli.py", line 6, in <module>
print(sys.argv[1])
IndexError: list index out of range
Command line arguments - len
- len
import sys
def main():
print(sys.argv)
print(len(sys.argv))
main()
Command line arguments - exit
- exit
- !=
import sys
def main():
if len(sys.argv) != 2:
exit("Usage: " + sys.argv[0] + " VALUE")
print("Hello " + sys.argv[1])
main()
To see the exit code of the script, run one of these right after it:
echo %errorlevel%     (Windows)
echo $?               (Linux and macOS)
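Calling exit (or sys.exit) with a string prints that string to the standard error and sets the exit code to 1. A small sketch that observes this from Python itself, by running a one-line program in a child interpreter with subprocess; the usage text is just a placeholder.

```python
import subprocess
import sys

# Run a tiny one-line program in a separate Python process.
child_code = "import sys; sys.exit('Usage: demo.py VALUE')"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
)

print(result.returncode)      # 1
print(result.stderr.strip())  # Usage: demo.py VALUE
```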
Exercise: Rectangle (argv)
- Create a script called basic2_rectangle_argv.py
- Change the above script that it will accept the arguments on the command line like this:
python basic2_rectangle_argv.py 2 4
Exercise: Calculator (argv)
- Create a script called basic2_calculator_argv.py that accepts 2 numbers and an operator (+, -, *, /) on the command line and prints the result of the operation.
python basic2_calculator_argv.py 2 + 3
python basic2_calculator_argv.py 6 / 2
python basic2_calculator_argv.py 6 * 2
Solution: Area of rectangle (argv)
import sys
def main():
if len(sys.argv) != 3:
exit("Needs 2 arguments: width length")
width = int( sys.argv[1] )
length = int( sys.argv[2] )
if length <= 0:
exit("length is not positive")
if width <= 0:
exit("width is not positive")
area = length * width
print("The area is ", area)
main()
Solution: Calculator (argv)
import sys
def main():
if len(sys.argv) < 4:
exit("Usage: " + sys.argv[0] + " OPERAND OPERATOR OPERAND")
a = float(sys.argv[1])
b = float(sys.argv[3])
op = sys.argv[2]
if op == '+':
res = a + b
elif op == '-':
res = a - b
elif op == '*':
res = a * b
elif op == '/':
res = a / b
else:
print("Invalid operator: '{}'".format(op))
exit()
print(res)
main()
The multiplication probably won't work because the Unix/Linux shell replaces the * by the list of files in your current directory and thus the python script will see a list of files instead of the *.
This is not your fault as a programmer. It is a user error. The correct way to run the script is python calc.py 2 '*' 3.
Solution: Calculator eval
import sys
def main():
if len(sys.argv) != 4:
exit(f"Usage: {sys.argv[0]} NUMBER OPERATOR NUMBER")
command = sys.argv[1] + sys.argv[2] + sys.argv[3]
print(command)
res = eval(command)
print(res)
main()
$ python examples/basics/calculator_argv_eval.py 2 + 3
5
$ python examples/basics/calculator_argv_eval.py 2 '*' 3
6
- Now forget this and don't use eval for the next few years!
Compilation vs. Interpretation
Compiled
- Languages: C, C++
- Development cycle: Edit, Compile (link), Run.
- Strong syntax checking during compilation and linking.
- Result: Stand-alone executable code.
- Need to compile to each platform separately. (Windows, Linux, Mac, 32bit vs 64bit).
Interpreted
- Shell, BASIC
- Development cycle: Edit, Run.
- Syntax check only during run-time.
- Result: we distribute the source code.
- Needs the right version of the interpreter on every target machine.
Both?
- Java (running on JVM - Java Virtual Machine)
- C# (running on CLR - Common Language Runtime)
Is Python compiled or interpreted?
There are syntax errors that will prevent your Python code from running
x = 2
print(x)
if x > 3
File "examples/other/syntax_error.py", line 4
if x > 3
^
SyntaxError: invalid syntax
There are other syntax-like errors that will be only caught during execution
x = 2
print(x)
print(y)
y = 13
print(42)
2
Traceback (most recent call last):
File "compile.py", line 5, in <module>
print(y)
NameError: name 'y' is not defined
def f():
global y
y = "hello y"
print("in f")
x = 2
print(x)
f()
print(y)
y = 13
print(42)
2
in f
hello y
42
- Python code is first compiled to bytecode and then interpreted.
- CPython is both the compiler and the interpreter.
- Jython and IronPython are mostly just compilers to JVM and CLR respectively.
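One way to see the two phases separately is the built-in compile function, which performs only the compilation step. In this sketch the code with the undefined name compiles without complaint, while the code with the missing colon is rejected before anything runs. The file name "<demo>" is just a label used in error messages.

```python
# Compiles fine: the undefined name y is only noticed at run time.
source_ok = "x = 2\nprint(x)\nprint(y)\n"
code = compile(source_ok, "<demo>", "exec")
print(type(code).__name__)  # code

# Rejected at compile time: the colon after the condition is missing.
source_bad = "x = 2\nif x > 3\n    print(x)\n"
try:
    compile(source_bad, "<demo>", "exec")
    compiled = True
except SyntaxError:
    compiled = False
print(compiled)  # False
```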
Flake8 checking
pip install flake8
flake8 --ignore= compile.py
compile.py:3:7: F821 undefined name 'y'
compile.py:6:1: W391 blank line at end of file
If you used Anaconda you can install with:
conda install flake8
Pylint checking
pip install pylint
len = 42
print(len)
pylint bad.py
************* Module bad
bad.py:1:0: C0114: Missing module docstring (missing-module-docstring)
bad.py:2:0: W0622: Redefining built-in 'len' (redefined-builtin)
bad.py:2:0: C0103: Constant name "len" doesn't conform to UPPER_CASE naming style (invalid-name)
--------------------------------------------------------------------
Your code has been rated at -5.00/10 (previous run: -5.00/10, +0.00)
Numbers
Numbers
a = 42 # decimal
h = 0xA3C # 2620 - hex - starting with 0x
o = 0o171 # 121 - octal - starting with 0o
# 011 works in Python 2.x but not in Python 3.x.
# Python 3 requires the o after the 0; that form also works in
# recent versions of Python 2.x (2.6 and later).
b = 0b101 # 5 - binary numbers - starting with 0b
r = 2.3
print(a) # 42
print(h) # 2620
print(o) # 121
print(b) # 5
print(r) # 2.3
Python prints numbers in decimal by default, but in the source code you can also use hexadecimal, octal, or binary notation. This is especially useful if the domain you are programming in uses those kinds of numbers. For example hardware engineers often talk in hexadecimal values. In that case you won't need to constantly translate between the form used in the current domain and decimal numbers.
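Going the other way is also possible. As a short sketch, the built-in functions hex, oct and bin return the string form of a number in the given notation, and int with an explicit base reads such a string back.

```python
n = 2620

print(hex(n))    # 0xa3c
print(oct(121))  # 0o171
print(bin(5))    # 0b101

# int() with an explicit base reads a string written in that notation.
print(int("A3C", 16))  # 2620
print(int("171", 8))   # 121
print(int("101", 2))   # 5
```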
Operators for Numbers
- +=
- -=
- ++
- --
- %
- /
- //
a = 2
b = 3
c = 2.3
d = a + b
print(d) # 5
print(a + b) # 5
print(a + c) # 4.3
print(b / a) # 1.5 # see the __future__
print(b // a) # 1 # floor division
print(a * c) # 4.6
print(a ** b) # 8 (power)
print(17 % 3) # 2 (modulus)
a += 7 # is the same as a = a + 7
print(a) # 9
# a++ # SyntaxError: invalid syntax
# a-- # SyntaxError: invalid syntax
a += 1
print(a) # 10
a -= 1
print(a) # 9
There is no autoincrement (++) and autodecrement (--) in Python, because they can be expressed by += 1 and -= 1 respectively.
Integer division and the future
- future
from __future__ import print_function
print(3/2)
$ python divide.py
1
$ python3 divide.py
1.5
from __future__ import print_function
from __future__ import division
print(3/2) # 1.5
If you need to use Python 2, remember that by default the division of two integers is integer based, so 3/2 returns 1. Importing division from __future__ changes this to the behavior we usually expect, 3/2 being 1.5. This is also the behavior in Python 3. If you already use Python 3 and would like to get the "old" behavior, that is to get the integer part of the division, you can use the floor division operator: b // a. You can also call the "int" function: int(b/a), but the two differ for negative numbers: // rounds down while int cuts off the fractional part.
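A quick sketch of the difference between floor division and the int function in Python 3. For positive numbers they agree; for negative results they do not.

```python
print(7 // 2)       # 3
print(int(7 / 2))   # 3

print(-7 // 2)      # -4  (rounds down)
print(int(-7 / 2))  # -3  (cuts off the fraction)
```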
Pseudo Random Number (uniform distribution)
- random
import random
a = random.random()
print(a) # 0.5648261676148922 a value x where 0.0 <= x < 1.0
print(random.random())
print(random.random())
- random
- Pseudo random generator
- Uniform distribution between 0-1
- For cryptographically strong random numbers use the secrets module.
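As a minimal sketch of what using the secrets module might look like, here is a dice roll and a random token. The variable names are only for illustration.

```python
import secrets

dice = secrets.randbelow(6) + 1  # a whole number between 1 and 6
token = secrets.token_hex(16)    # 16 random bytes as 32 hexadecimal characters

print(dice)
print(token)
```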
Fixed random numbers
- random
- seed
import random
random.seed(37)
print(random.random()) # 0.6820045605879779
print(random.random()) # 0.09160260807956389
print(random.random()) # 0.6178163488614024
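The point of the seed is repeatability. The following sketch sets the same seed twice and checks that the same series of numbers comes out both times. It uses a list comprehension, which we have not learned yet, only to collect three numbers.

```python
import random

random.seed(37)
first_run = [random.random() for _ in range(3)]

random.seed(37)
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True
```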
Rolling dice - randrange
- randrange
import random
print( 1 + int( 6 * random.random() ))
print(random.randrange(1, 7))
# One of the following: 1, 2, 3, 4, 5, 6
Random choice
- choice
import random
letters = "abcdefghijklmno"
print(random.choice(letters)) # pick one of the letters
fruits = ["Apple", "Banana", "Peach", "Orange", "Durian", "Papaya"]
print(random.choice(fruits))
# pick one of the fruits
built-in method
- A common mistake. Not calling the method.
import random
rnd = random.random
print(rnd) # <built-in method random of Random object at 0x124b508>
y = rnd()
print(y) # 0.7740737563564781
print(random.random) # <built-in method random of Random object at 0x124b508>
x = rnd
print(x) # <built-in method random of Random object at 0x124b508>
print(x()) # 0.5598791496813703
When you see a string like the above "built-in method ..." you can be almost certain that you have forgotten the parentheses at the end of a method call.
Exception: TypeError: 'module' object is not callable
- A common mistake. Calling the module and not the function.
import random
print("hello")
x = random()
print(x)
hello
Traceback (most recent call last):
File "examples/numbers/rnd.py", line 3, in <module>
x = random()
TypeError: 'module' object is not callable
Fixing the previous code
import random
x = random.random()
print(x)
from random import random
x = random()
print(x)
Exception: AttributeError: module 'random' has no attribute
- A common mistake. Using the wrong filename.
This works fine:
print("Hello World")
This gives an error
import random
print(random.random())
Traceback (most recent call last):
File "rnd.py", line 2, in <module>
print(random.random())
AttributeError: module 'random' has no attribute 'random'
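Make sure the names of your files are not the same as the names of any of the Python modules or packages. The usual cause of this error is a file called random.py in the same directory: import random then loads that file instead of the standard random module.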
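If you are not sure which file Python has actually loaded, one way to check is to print the __file__ attribute of the module. A short sketch:

```python
import random

# For the standard module this path points into the Python installation.
# If it points into your own project directory, a local file is shadowing it.
print(random.__file__)
```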
Exercise: Number guessing game - level 0
Level 0
- Create a file called number_guessing_game_0.py
- Using the random module the computer "thinks" about a whole number between 1 and 20.
- The user has to guess the number. After the user types in the guess the computer tells if this was bigger or smaller than the number it generated, or if it was the same.
- The game ends after just one guess.
Level 1-
- Other levels in the next chapter.
Exercise: Fruit salad
-
Write a script called fruit_salad.py based on the following skeleton, that will pick 3 fruits from a list of fruits like the one we had in one of the earlier slides. Print the 3 names.
-
Could you make sure the 3 fruits are different?
-
Use the following skeleton:
fruits = ["Apple", "Banana", "Peach", "Orange", "Durian", "Papaya"]
Solution: Number guessing game - level 0
import random
hidden = random.randrange(1, 21)
print("The hidden value is", hidden)
user_input = input("Please enter your guess: ")
print(user_input)
guess = int(user_input)
if guess == hidden:
print("Hit!")
elif guess < hidden:
print("Your guess is too low")
else:
print("Your guess is too high")
Solution: Fruit salad
- random
- sample
import random
fruits = ["Apple", "Banana", "Peach", "Orange", "Durian", "Papaya"]
salad = random.sample(fruits, 3)
print(salad)
Comparison and Boolean
if statement again
- if
- ==
x = 2
if x == 2:
print("it is 2")
else:
print("it is NOT 2")
if x == 3:
print("it is 3")
else:
print("it is NOT 3")
# it is 2
# it is NOT 3
Comparison operators
- ==
- !=
- <
- <=
- >
- >=
== equal
!= not equal
< less than
<= less than or equal
> greater than
>= greater than or equal
Compare numbers, compare strings
x = 2
y = 3
if x < y:
print("x is less than y")
# x is less than y
x = "Snake"
y = "Stake"
if x < y:
print("x is less than y")
# x is less than y
z = "מלון"
q = "בלון"
if z < q:
print(f"{z} in z is less than {q}")
else:
print(f"{q} in q is less than {z}")
print(x < z)
x = "👸"
y = "💂"
if x < y:
print(f"{x} in x is less than {y}")
else:
print(f"{y} in y is less than {x}")
print(1 < 2) # True
print("1" < "2") # True
print(2 < 11) # True
print("2" < "11") # False
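Strings are compared character by character, using the Unicode code point of each character. That is why "2" < "11" is False. The built-in ord function shows the numbers behind the comparison.

```python
print(ord("1"))  # 49
print(ord("2"))  # 50

# The first characters already decide: "2" (50) is not less than "1" (49).
print("2" < "11")  # False

# Converted to numbers the result is what we expect.
print(int("2") < int("11"))  # True
```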
Do NOT Compare different types!
x = 12
y = 3
result = "Yes" if x > y else "No"
print(result) # Yes
x = "12"
y = "3"
print("Yes" if x > y else "No") # No
x = "12"
y = 3
print("Yes" if x > y else "No") # Yes
x = 12
y = "3"
print("Yes" if x > y else "No") # No
In Python 2 please be careful and only compare the same types. Otherwise the result will look strange.
Yes
No
Yes
No
In Python 3, comparing different types raises an exception:
Yes
No
Traceback (most recent call last):
File "examples/other/compare.py", line 11, in <module>
print("Yes" if x > y else "No") # Yes
TypeError: '>' not supported between instances of 'str' and 'int'
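The usual way out is to convert both values to the same type before comparing them. A minimal sketch:

```python
x = "12"
y = 3

print(int(x) > y)  # True: 12 > 3 as numbers
print(x > str(y))  # False: as strings, "1" sorts before "3"
```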
Complex if statement with boolean operators
- Boolean operators or Logical operators
- and
- or
- not
age = 16
name = "Foo"
if 0 < age and age <= 18:
print("age is between 0 and 18")
else:
print("age is NOT between 0 and 18")
if age < 18 or 65 < age:
print("Young or old")
else:
print("Working age")
if age < 18 and not name == "Foo":
print("True")
else:
print("False")
Chained expressions
age = 16
name = "Foo"
if 0 < age and age <= 18:
print("age is between 0 and 18")
else:
print("age is NOT between 0 and 18")
if 0 < age <= 18:
print("age is between 0 and 18")
else:
print("age is NOT between 0 and 18")
Boolean operators
- and
- or
- not
if COND:
do something
else:
do something other
if not COND:
do something other
if COND1 and COND2:
do something
if COND1 or COND2:
do something
if COND1 and not COND2:
do something
Boolean truth tables
COND1  and  COND2   Result
True        True    True
True        False   False
False       True    False
False       False   False

COND1  or   COND2   Result
True        True    True
True        False   True
False       True    True
False       False   False

not  COND    Result
     True    False
     False   True
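You don't have to take the tables on trust. A small sketch that lets Python compute them, using two nested for loops (which we'll learn about later); the rows list is only there to keep the results.

```python
rows = []
for cond1 in [True, False]:
    for cond2 in [True, False]:
        # each row: COND1, COND2, COND1 and COND2, COND1 or COND2
        rows.append((cond1, cond2, cond1 and cond2, cond1 or cond2))

for row in rows:
    print(row)

print(not True)   # False
print(not False)  # True
```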
Boolean values: True and False
- True
- False
In this chapter we are going to talk about boolean values and operations on boolean values.
Unlike in some other languages Python actually has 2 special symbols to represent True and False.
(In those languages 0 usually represents False and 1 represents True.)
- True
- False
Using True and False in variables
x = True
y = False
if x:
print("X is True")
else:
print("X is False")
if y:
print("Y is True")
else:
print("Y is False")
# X is True
# Y is False
Comparison returns True or False
a = "42"
b = 42
print(a) # 42
print(b) # 42
print(a == b) # False
print(a != b) # True
print(b == 42.0) # True
print(None == None) # True
print(None == False) # False
Assign comparisons to variables
- True and False are real boolean values.
- False
- True
x = 2
v = x == 2
print(v)
if v:
    print(v, "is true - who would have thought? ")
v = x == 3
print(v)
if v:
    print(v, "is true - who would have thought? ")
else:
    print(v, "is false - who would have thought? ")
# True
# True is true - who would have thought?
# False
# False is false - who would have thought?
Flag
correct = False
name = input("The name of this language: ")
if name == "Python":
correct = True
if correct:
print("The input was correct")
Use flag to skip first few lines
We have a series of rows that we might read from a file and would like to process the sections of rows that start with a well-defined row. Unfortunately the file does not always start with a row that matches the definition. In some cases there are a few lines at the beginning of the file that we need to throw away before we can start our processing.
In this example we use a series of numbers to represent the rows of that file, and the "well-defined" condition that starts a series is a number being "big".
We can use a variable as a "flag" to indicate if we are still before the first good section or if the sections already started.
def print_series(series):
#started = False;
for val in series:
if val > 10:
# started = True
print("start new series")
print(val)
#if not started:
# continue
if val <= 10:
print(val)
print_series([20, 2, 3, 30, 1, 7])
print()
print_series([1, 4, 20, 2, 3, 30, 1, 7])
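The commented-out lines in the function above hint at the flag. A possible version with the flag switched on might look like the sketch below. The name print_series_with_flag is made up so the two versions can be compared side by side.

```python
def print_series_with_flag(series):
    started = False  # flag: has the first "big" number been seen yet?
    for val in series:
        if val > 10:
            started = True
            print("start new series")
        if not started:
            continue  # still before the first series: throw the row away
        print(val)


print_series_with_flag([1, 4, 20, 2, 3, 30, 1, 7])
```

With this input the leading 1 and 4 are skipped and the output starts with "start new series" followed by 20.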
Toggle
- not
machine_is_on = False
print(machine_is_on) # False
# Instead of this:
if machine_is_on:
machine_is_on = False
else:
machine_is_on = True
# Write this:
machine_is_on = not machine_is_on
print(machine_is_on) # True
machine_is_on = not machine_is_on
print(machine_is_on) # False
Short circuit
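money = 2000000
salary = 500
def check_money():
    return money > 1000000
def check_salary():
    global salary
    salary += 1  # side effect: every call changes salary
    return salary >= 1000
for _ in range(3):
    # check_money() is True, so "or" never calls check_salary()
    if check_money() or check_salary():
        print("I can live well")
print(salary)  # 500 - the side effect never happened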
Short circuit fixed
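money = 2000000
salary = 500
def check_money():
    return money > 1000000
def check_salary():
    global salary
    salary += 1  # side effect: every call changes salary
    return salary >= 1000
for _ in range(3):
    # call both functions first, then combine the results
    has_good_money = check_money()
    has_good_salary = check_salary()
    if has_good_money or has_good_salary:
        print("I can live well")
print(salary)  # 503 - check_salary() ran in every iteration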
Does this value count as True or False?
x = 23
if x:
print("23 is true")
if x != 0:
print("23 is true")
y = 0
if y:
print("0 is true")
else:
print("0 is false")
if y != 0:
print("0 is true")
else:
print("0 is false")
# 23 is true
# 0 is false
True and False values in Python
- None
- 0
- "" (empty string)
- False
- []
- {}
- ()
Everything else is true.
values = [None, 0, "", False, [], (), {}, "0", True]
for v in values:
if v:
print("True value: ", v)
else:
print("False value: ", v)
# False value: None
# False value: 0
# False value:
# False value: False
# False value: []
# False value: ()
# False value: {}
# True value: 0
# True value: True
None is like undef or Null or nil in other languages.
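If you are unsure how a value will behave in an if statement, the built-in bool function tells you. A few cases that tend to surprise people, as a short sketch:

```python
print(bool(0))    # False
print(bool(0.0))  # False
print(bool("0"))  # True: a non-empty string
print(bool(" "))  # True: a space is still a character
print(bool([]))   # False
print(bool([0]))  # True: a list with one element
```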
Incorrect use of conditions
In your normal speech you could probably say something like "If status_code is 401 or 302, do something.". Meaning status_code can be either 401 or 302.
If you tried to translate this into code directly you would write something like this:
if status_code == 401 or 302:
pass
Python treats it as if we wrote:
if (status_code == 401) or 302:
pass
However, this is not what we meant. The condition will always be true: Python compares status_code to 401, and separately checks whether 302 counts as True. Any number different from 0 is considered to be True, so the whole expression is always True.
What you probably meant is this:
if status_code == 401 or status_code == 302:
pass
Alternative way:
An alternative way to achieve the same result uses the "in" operator and a list (comma-separated values in square brackets), though at this point we have not learned either of them:
if status_code in [401, 302]:
pass
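To see the difference between the three forms, here is a short sketch that prints the value of each condition for a status code that is neither 401 nor 302.

```python
status_code = 200

print(status_code == 401 or 302)                 # 302, which counts as True
print(status_code == 401 or status_code == 302)  # False
print(status_code in [401, 302])                 # False
```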
Exercise: compare numbers
- Create a file called bool_compare_numbers.py
- Ask the user to enter two numbers and tell us which one is bigger.
Exercise: compare strings
- Create a file called bool_compare_strings.py
- You can use the len() function to get the length of the string.
- Ask the user to enter two strings
- Then ask the user to select if she wants to compare them based on Unicode or based on their length
- Then tell us which one is bigger.
Input a string: (user types string and ENTER)
Input another string: (user types string and ENTER)
How to compare:
1) Unicode
2) Length
(user types 1 or 2 and ENTER)
Solution: compare numbers
a_in = input("Please type in a whole number: ")
b_in = input("Please type in another whole number: ")
if not a_in.isdecimal():
exit("First input was not a whole number")
if not b_in.isdecimal():
exit("Second input was not a whole number")
a_num = float(a_in)
b_num = float(b_in)
if a_num > b_num:
print("First number is bigger")
elif a_num < b_num:
print("First number is smaller")
else:
print("They are equal")
Solution: compare strings
a_in = input("Please type in a string: ")
b_in = input("Please type in another string: ")
print("How to compare:")
print("1) Unicode")
print("2) Length")
how = input()
if how == '1':
    first_is_bigger = a_in > b_in
    second_is_bigger = a_in < b_in
elif how == '2':
    first_is_bigger = len(a_in) > len(b_in)
    second_is_bigger = len(a_in) < len(b_in)
else:
    exit("Please type 1 or 2")
if first_is_bigger:
    print("First string is bigger")
elif second_is_bigger:
    print("First string is smaller")
else:
    print("They are equal")
Strings
Single quoted and double quoted strings
In Python, just as in most of the programming languages you must put any free text inside a pair of quote characters. Otherwise Python will try to find meaning in the text.
These pieces of texts are called "strings".
In Python you can put string between two single quotes: '' or between two double quotes: "". Which one, does not matter.
soup = "Spiced carrot & lentil soup"
salad = 'Caesar salad'
print(soup)
print(salad)
Spiced carrot & lentil soup
Caesar salad
Long lines
text = "abc" "def"
print(text)
other = "abcdef"
print(other)
long_string = "one" "two" "three"
print(long_string)
short_rows = "one" \
"two" \
"three"
print(short_rows)
long_string = "first row second row third row"
print(long_string)
shorter = "first row \
second row \
third row"
print(shorter)
abcdef
abcdef
onetwothree
onetwothree
first row second row third row
first row second row third row
Multiline strings
- We would like to print the numbers one under the other
text = "Joe: 23\nJane: 7 \nJacqueline 19\n"
print(text)
Joe: 23
Jane: 7
Jacqueline 19
Triple quoted strings (multiline)
- """
- '''
If you would like to create a string that spreads over multiple lines, you can put the text between 3 quotes on both sides. Either 3 single-quotes or 3 double-quotes.
text = """
Joe: 23
Jane: 7
Jacqueline 19
"""
print(text)
Joe: 23
Jane: 7
Jacqueline 19
Can spread multiple lines.
first row
second row
third row
Triple quoted comments - documentation
"""
Documentation of the module
"""
def some_function():
"Documentation of the function"
pass
text = """first row
second row
third row"""
"a string"
"""another
longer
string with code:
print("this is not printed")
"""
print("Hello World")
String length (len)
- len
The len
function returns the length of the string in number of characters.
line = "Hello World"
hw = len(line)
print(hw) # 11
text = """Hello
World"""
print(len(text)) # 11 (the newline counts as one character)
String repetition and concatenation
You might be used to the fact that you can only multiply numbers, but in python you can also "multiply" a string by a number. It is called repetition. In this example we have a string "Jar " that we repeat twice.
We can also add two strings to concatenate them together.
I don't think the repetition operator is used very often, but in one case it could come in very handy. When you are writing some text report and you'd like to add a long line of dashes that would be exactly the same length as your title.
name = 2 * 'Jar '
print(name) # Jar Jar
full_name = name + 'Binks'
print(full_name) # Jar Jar Binks
title = "We have some title"
print(title)
print('-' * len(title))
# We have some title
# ------------------
A character in a string
- []
text = "Hello World"
a = text[0]
print(a) # H
b = text[6]
print(b) # W
String slice (instead of substr)
- slice
- substr
- [:]
- :
text = "Hello World"
b = text[1:4]
print(b) # ell
print(text[2:]) # llo World
print(text[:2]) # He
start = 1
end = 4
print(text[start:end]) # ell
Change a string
- immutable
In Python strings are "immutable", meaning you cannot change them. You can replace a whole string in a variable, but you cannot change it.
In the following example we wanted to replace the 3rd character (index 2), and put "Y" in place. This raised an exception
text = "abcd"
print(text) # abcd
text[2] = 'Y'
print("done")
print(text)
abcd
Traceback (most recent call last):
File "string_change.py", line 4, in <module>
text[2] = 'Y'
TypeError: 'str' object does not support item assignment
Replace part of a string
- Strings in Python are immutable - they never change.
How to change a string
text = "abcd"
print(text) # abcd
text = text[:2] + 'Y' + text[3:]
print(text) # abYd
String copy
text = "abcd"
print(text) # abcd
text = text + "ef"
print(text) # abcdef
other = text
print(other) # abcdef
text = "xyz"
print(text) # xyz
print(other) # abcdef
When we assign a variable that points to a string to another variable, the new variable points to the same string. If we then assign some other string to either of the variables, they will point to two different strings.
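The "is" operator, which we have not covered yet, checks whether two variables point to the very same object. A short sketch of the situation described above:

```python
text = "abcdef"
other = text
print(other is text)  # True: two names, one string

text = "xyz"
print(other is text)  # False: text now points to a different string
print(other)          # abcdef
```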
String functions and methods (len, upper, lower)
- len
- upper
- lower
a = "xYz"
print(len(a)) # 3
b = a.upper()
print(b) # XYZ
print(a) # xYz - immutable!
print(a.lower()) # xyz
- Type dir("") in the REPL to get the list of string methods.
- List of built-in functions.
- List of string methods.
index in string
- index
- ValueError
text = "The black cat climbed the green tree."
print(text.index("bl")) # 4
print(text.index("The")) # 0
print(text.index("the")) # 22
print(text.index("dog"))
4
0
22
Traceback (most recent call last):
File "index.py", line 5, in <module>
print(text.index("dog"))
ValueError: substring not found
index in string with range
- index
text = "The black cat climbed the green tree."
print(text.index("c")) # 7
print(text.index("c", 8)) # 10
print(text.index("gr", 8)) # 26
print(text.index("gr", 8, 16))
7
10
26
Traceback (most recent call last):
File "examples/strings/index2.py", line 8, in <module>
print(text.index("gr", 8, 16))
ValueError: substring not found
Find all in the string
We will see this later, once we have learned loops.
rindex in string with range
- rindex
text = "The black cat climbed the green tree."
print(text.rindex("c")) # 14
print(text.rindex("c", 8)) # 14
print(text.rindex("c", 8, 13)) # 10
print(text.rindex("gr", 8)) # 26
print(text.rindex("gr", 8, 16))
14
14
10
26
Traceback (most recent call last):
File "examples/strings/rindex.py", line 10, in <module>
print(text.rindex("gr", 8, 16))
ValueError: substring not found
find in string
- find
- rfind
Alternatively use find and rfind that will return -1 instead of raising an exception.
text = "The black cat climbed the green tree."
print(text.find("bl")) # 4
print(text.find("The")) # 0
print(text.find("dog")) # -1
print(text.find("c")) # 7
print(text.find("c", 8)) # 10
print(text.find("gr", 8)) # 26
print(text.find("gr", 8, 16)) # -1
print(text.rfind("c", 8)) # 14
in string
- in
Check if a substring is in the string?
txt = "hello world"
if "wo" in txt:
print('found wo')
if "x" in txt:
print("found x")
else:
print("NOT found x")
found wo
NOT found x
index if in string
- index
- in
sub = "cat"
txt = "The black cat climbed the green tree"
if sub in txt:
loc = txt.index(sub)
print(sub + " is at " + str(loc))
sub = "dog"
if sub in txt:
loc = txt.index(sub)
print(sub + " is at " + str(loc))
# cat is at 10
Encodings: ASCII, Windows-1255, Unicode
raw strings
- r
# file_a = "c:\Users\Foobar\readme.txt"
# print(file_a)
# Python2: eadme.txtFoobar
# Python3:
# File "examples/strings/raw.py", line 6
# file_a = "c:\Users\Foobar\readme.txt"
# ^
# SyntaxError: (unicode error) 'unicodeescape' codec
# can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
file_b = "c:\\Users\\Foobar\\readme.txt"
print(file_b) # c:\Users\Foobar\readme.txt
file_c = r"c:\Users\Foobar\readme.txt"
print(file_c) # c:\Users\Foobar\readme.txt
text = r"text \n \d \s \ and more"
print(text) # text \n \d \s \ and more
In a raw string the backslashes are kept intact and are not treated as escape sequences. Raw strings are often used in regexes.
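A short sketch of the difference, and of why raw strings are handy for regexes. The re module and the sample sentence are only there for the demonstration; regexes are covered much later.

```python
import re

print(len("\n"))   # 1: a single newline character
print(len(r"\n"))  # 2: a backslash and the letter n

# \d+ means "one or more digits" to the regex engine.
print(re.findall(r"\d+", "room 101, floor 3"))  # ['101', '3']
```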
ord
-
ord
print( ord('a') ) # 97
print( ord('=') ) # 61
print( ord('\r') ) # 13
print( ord('\n') ) # 10
print( ord(' ') ) # 32
print( ord('á') ) # 225 Hungarian
print( ord('ó') ) # 243
print( ord('א') ) # 1488 Hebrew alef
print( ord('أ') ) # 1571 Arabic/Farsi
print( ord('α') ) # 945 Greek
print( ord('ㅏ') ) # 12623 Korean
print( ord('😈') ) # 128520
chr - number to character
-
chr
print( chr(33) ) # !
print( chr(48) ) # 0
print( chr(65) ) # A
print( chr(225) ) # á Hungarian
print( chr(243) ) # ó Hungarian
print( chr(1489) ) # ב Hebrew bet
print( chr(1572) ) # ؤ Arabic/Farsi
print( chr(945) ) # α Greek
print( chr(959) ) # ο Greek omicron
print( chr(937) ) # Ω Greek omega
print( chr(931) ) # Σ Greek sigma
print( chr(4632) ) # መ Amharic
print( chr(12624) ) # ㅐ Korean
print( chr(128519) ) # 😇
print( chr(128520) ) # 😈
- Hebrew alphabet
- Arabic alphabet
- Greek alphabet
- Korean alphabet - Hangul
- Amharic
- Klingon script - proposal - no official support
Exercise: one string in another string
- Write a script called string_in_another_string.py that accepts two strings and tells if one of them can be found in the other and where.
Exercise: Character to Unicode - CLI
Write a script called char_to_unicode.py that gets a character on the command line and prints out its Unicode code point.
Maybe even:
Write a script that gets a string on the command line and prints out the Unicode code point of each character.
Exercise: from Unicode to character - CLI
Write script called unicode_to_char.py that accepts a number on the command line and prints the character represented by that number.
Exercise: ROT13
- rot13
- Implement ROT13:
- Create a script called rot13.py that, given a string on the command line, prints the ROT13 version of the string.
- It should work like this:
$ python rot13.py "Hello World!"
Uryyb Jbeyq!
$ python rot13.py "Uryyb Jbeyq!"
Hello World!
Solution: one string in another string
import sys
if len(sys.argv) != 3:
    exit(f"Usage: {sys.argv[0]} short-STRING long-STRING")
string = sys.argv[1]
text = sys.argv[2]
if string in text:
    loc = text.index(string)
    print(string, "can be found in ", text, "at", loc)
else:
    print(string, "can NOT be found in ", text)
Solution: compare strings
mode = input("Mode of comparison: [length|ascii]")
if mode != "length" and mode != "ascii":
    print("Not good")
    exit()
str1 = input("String 1:")
str2 = input("String 2:")
if mode == "length":
    if len(str1) > len(str2):
        print("First is longer")
    elif len(str1) < len(str2):
        print("Second is longer")
    else:
        print("They are of equal length")
elif mode == "ascii":
    if str1 > str2:
        print("First is later in the ABC order")
    elif str1 < str2:
        print("Second is later in the ABC order")
    else:
        print("The strings are equal")
Solution: to Unicode CLI
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} CHARACTER")
print( ord( sys.argv[1]) )
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} STRING")
for cr in sys.argv[1]:
    print( ord( cr ) )
Solution: from Unicode CLI
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} NUMBER")
print( chr( int(sys.argv[1]) ) )
Solution: Show characters based on Unicode code-points
import sys
if len(sys.argv) != 3:
    exit(f"Usage: {sys.argv[0]} START END")
start, end = sys.argv[1:]
for decimal in range(int(start), int(end)+1):
    print(f"{decimal} {chr(decimal)}")
# Emojis:
# 127744 -
# 128506 - 128591
Solution: ROT13
- rot13
- codecs
- encoding
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} TEXT")
original = sys.argv[1]
encoded = ''
for char in original:
    code = ord(char)
    if 'a' <= char <= 'z':
        #if ord('a') <= code and code <= ord('z'):
        new_char = chr((code-ord('a') + 13 ) % 26 + ord('a'))
    elif 'A' <= char <= 'Z':
        new_char = chr((code-65 + 13 ) % 26 + 65)
    else:
        new_char = char
    encoded += new_char
print(encoded)
Of course instead of implementing all the calculations by yourself you can also rely on a module that comes with Python:
import sys
import codecs
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} TEXT")
original = sys.argv[1]
encoded = codecs.encode(original, encoding='rot_13')
print(encoded)
Infinite loop
i = 0
while True:
    i += 1
    print(i)
print("done")
break
- break
i = 0
while True:
    print(i)
    i += 1
    if i >= 7:
        break
print("done")
0
1
2
3
4
5
6
done
continue
- continue
i = 0
while True:
    i += 1
    if i > 3 and i < 8:
        continue
    if i > 10:
        break
    print(i)
1
2
3
8
9
10
While with many conditions
while (not found_error) and (not found_warning) and (not found_exit):
    do_the_real_stuff()
while True:
    line = get_next_line()
    if found_error:
        break
    if found_warning:
        break
    if found_exit:
        break
    do_the_real_stuff()
while loop with many conditions
while True:
    line = get_next_line()
    if last_line:
        break
    if line_is_empty:
        continue
    if line_has_a_hash:  # at the beginning
        continue
    if line_has_two_slashes:  # // at the beginning
        continue
    do_the_real_stuff()
ord in a file
- ord
import sys
filename = sys.argv[1]
with open(filename) as fh:
    content = fh.read()
for c in content:
    print(ord(c))
Strings as Comments
- '''
- # marks single-line comments.
- There are no real multi-line comments in Python, but we can use triple-quotes to create multi-line strings, and if they are not part of another statement, the Python interpreter disregards them. This effectively creates multi-line comments.
print("hello")
'A string which is disregarded'
print(42)
'''
Using three single-quotes on both ends (a triple-quoted string)
can be used as a multi-line comment.
'''
print("world")
Loops
Loops: for-in and while
- for in - to iterate over a well defined list of values. (characters, range of numbers, shopping list, etc.)
- while - repeat an action till some condition is met. (or stopped being met)
for-in loop on strings
- for
txt = 'hello world'
for ch in txt:
    print(ch)
h
e
l
l
o
w
o
r
l
d
for-in loop on list
- for
fruits = ["Apple", "Banana", "Peach", "Orange", "Durian", "Papaya"]
for fruit in fruits:
    print(fruit)
Apple
Banana
Peach
Orange
Durian
Papaya
for-in loop on range
- range
for ix in range(3, 7):
    print(ix)
3
4
5
6
Iterable, iterator
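The heading above names the two concepts the for-in loop relies on. A minimal sketch, assuming only the built-in iter and next functions: an iterable (string, list, range) can hand out an iterator, and the iterator returns the items one by one until it raises StopIteration.

```python
letters = "abc"          # a string is an iterable
it = iter(letters)       # ask it for an iterator

first = next(it)
second = next(it)
third = next(it)
print(first, second, third)   # a b c

try:
    next(it)             # nothing is left
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)         # True
```

A for-in loop does roughly the same behind the scenes and stops quietly when StopIteration is raised.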
for in loop with early end using break
- break
txt = 'hello world'
for ch in txt:
    if ch == ' ':
        break
    print(ch)
print("Here")
h
e
l
l
o
Here
for in loop skipping parts using continue
- continue
txt = 'hello world'
for ch in txt:
    if ch == ' ':
        continue
    print(ch)
print("done")
h
e
l
l
o
w
o
r
l
d
done
for in loop with break and continue
txt = 'hello world'
for cr in txt:
    if cr == ' ':
        continue
    if cr == 'r':
        break
    print(cr)
print('done')
h
e
l
l
o
w
o
done
while loop
- while
import random
total = 0
while total <= 100:
    print(total)
    total += random.randrange(20)
print("done")
0
10
22
29
45
54
66
71
77
82
93
done
Infinite while loop
- while
import random
total = 0
while total >= 0:
    print(total)
    total += random.randrange(20)
print("done")
...
1304774
1304779
1304797
^C1304803
Traceback (most recent call last):
File "while_infinite.py", line 5, in <module>
print(total)
KeyboardInterrupt
- Don't do this!
- Make sure there is a proper end-condition. (exit-condition)
- Use Ctrl-C to stop it
While with complex expression
import random
def random_loop():
    total = 0
    while (total < 10000000) and (total % 17 != 1) and (total ** 2 % 23 != 7):
        print(total)
        total += random.randrange(20)
        # do the real work here
    print("done")
if __name__ == '__main__':
    random_loop()
0
12
25
26
34
50
65
77
done
While with break
import random
def random_loop():
    total = 0
    while total < 10000000:
        if total % 17 == 1:
            break
        if total ** 2 % 23 == 7:
            break
        print(total)
        total += random.randrange(20)
        # do the real work here
    print("done")
if __name__ == '__main__':
    random_loop()
0
12
25
26
34
50
65
77
done
While True
import random
def random_loop():
    total = 0
    while True:
        if total >= 10000000:
            break
        if total % 17 == 1:
            break
        if total ** 2 % 23 == 7:
            break
        print(total)
        total += random.randrange(20)
        # do the real work here
    print("done")
if __name__ == '__main__':
    random_loop()
0
12
25
26
34
50
65
77
done
Testing the refactoring of the while loop
import while_break
import while_complex_condition
import while_true
import random
import pytest
@pytest.mark.parametrize('seed', [0, 7, 9, 21])
def test_random_loop(capsys, seed):
    random.seed(seed)
    while_complex_condition.random_loop()
    out_complex, _ = capsys.readouterr()
    random.seed(seed)
    while_break.random_loop()
    out_break, _ = capsys.readouterr()
    assert out_complex == out_break
    random.seed(seed)
    while_true.random_loop()
    out_true, _ = capsys.readouterr()
    assert out_complex == out_true
    print(out_true)
def test_newest_random_loop_0(capsys):
    expected = """0
12
25
26
34
50
65
77
done
"""
    random.seed(0)
    while_true.random_loop()
    out_true, _ = capsys.readouterr()
    assert out_true == expected
def test_newest_random_loop_7(capsys):
    expected = """0
10
14
26
27
29
46
49
60
78
79
95
101
102
104
117
130
132
139
141
158
done
"""
    random.seed(7)
    while_true.random_loop()
    out_true, _ = capsys.readouterr()
    assert out_true == expected
pytest test_random_loop.py
pytest -s test_random_loop.py
Duplicate input call
- Ask the user what is their ID number.
- Check if it is a valid ID number. (To make our code more simple we only check the length of the string.)
- Ask again if it was not a valid number.
id_str = input("Type in your ID: ")
if len(id_str) != 9:
    id_str = input("Type in your ID")
print("Your ID is " + id_str)
- Realize, that if the user types in an incorrect string for the 2nd time, our code does not check it.
Duplicate input call with loop
- A while loop would be a better solution.
- This works, but now we have duplicated the input call and the text is different in the two cases. This violates DRY (Don't Repeat Yourself).
- We can't remove the first call of input as we need the id_str variable in the condition of the while already.
id_str = input("Type in your ID: ")
while len(id_str) != 9:
    id_str = input("Type in your ID")
print("Your ID is " + id_str)
Eliminate duplicate input call
- We can use the while True construct to avoid this duplication.
while True:
    id_str = input("Type in your ID: ")
    if len(id_str) == 9:
        break
print("Your ID is " + id_str)
do while loop
- do while
- There is no do ... while in Python but we can write code like this to have a similar effect.
while True:
    answer = input("What is the meaning of life? ")
    if answer == '42':
        print("Yeeah, that's it!")
        break
print("done")
while with many continue calls
- continue
while True:
    line = get_next_line()
    if last_line:
        break
    if line_is_empty:
        continue
    if line_has_a_hash_at_the_beginning:  # #
        continue
    if line_has_two_slashes_at_the_beginning:  # //
        continue
    do_the_real_stuff()
Break out from multi-level loops
Not supported in Python. "If you feel the urge to do that, your code is probably too complex. Create functions!"
while external():
    while internal():
        if ...:
            break
        if ...:
            continue
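One way to follow the advice above is to move the nested loops into a function and leave both loops at once with return. This is only a sketch; find_pair and its sample data are hypothetical names made up for the illustration.

```python
def find_pair(rows, target):
    """Return (row_index, col_index) of the first cell equal to target, or None."""
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value == target:
                return r, c      # leaves both loops at once
    return None

grid = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
print(find_pair(grid, 6))    # (1, 2)
print(find_pair(grid, 10))   # None
```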
For-else
The else part will be executed if the loop finished all the iterations without calling break.
found_number_bigger_than_10 = False
numbers = [2, 3, 4]
for num in numbers:
    if num > 10:
        found_number_bigger_than_10 = True
        break
    print(num)
if found_number_bigger_than_10:
    print("found number bigger than 10")
print('---------------------')
found_number_bigger_than_10 = False
numbers = [2, 3, 12, 4]
for num in numbers:
    if num > 10:
        found_number_bigger_than_10 = True
        break
    print(num)
if found_number_bigger_than_10:
    print("found number bigger than 10")
print('---------------------')
for num in [2, 3, 4]:
    if num > 10:
        break
    print(num)
else:
    print("in else - finished without calling break")
    print("not found number bigger than 10")
print('---------------------')
for num in [2, 3, 12, 4]:
    if num > 10:
        break
    print(num)
else:
    print("in else - finished after calling break")
    print("not found number bigger than 10")
2
3
4
---------------------
2
3
found number bigger than 10
---------------------
2
3
4
in else - finished without calling break
not found number bigger than 10
---------------------
2
3
Exercise: Print all the locations in a string
- Create a file called location_in_string.py
- Given a string like "The black cat climbed the green tree.", print out the location of every "c" character.
Expected:
7
10
14
Exercise: Number guessing game
- Every level must include all the features from all the lower levels as well.
Level 0
- Create a file called number_guessing_game_0.py
- Using the random module the computer "thinks" about a whole number between 1 and 20.
- The user has to guess the number. After the user types in the guess the computer tells if this was bigger or smaller than the number it generated, or if it was the same.
- The game ends after just one guess.
Level 1
- Create a file called number_guessing_game_1.py
- The user can guess several times. The game ends when the user guessed the right number.
Level 2
- Create a file called number_guessing_game_2.py
- If the user hits 'x', we leave the game without guessing the number.
Level 3
- Create a file called number_guessing_game_3.py
- If the user presses 's', show the hidden value (cheat)
Level 4
- Create a file called number_guessing_game_4.py
- Soon we'll have a level in which the hidden value changes after each guess. In order to make that mode easier to track and debug, first we would like to have a "debug mode".
- If the user presses 'd' the game gets into "debug mode": the system starts to show the current number to guess every time, just before asking the user for new input.
- Pressing 'd' again turns off debug mode. (It is a toggle: each press of 'd' changes the value to the other possible value.)
Level 5
- Create a file called number_guessing_game_5.py
- The 'm' button is another toggle. It is called 'move mode'. When it is 'on', the hidden number changes a little bit after every step (+/-2). That is, it changes by one of the following: -2, -1, 0, 1, 2. Pressing 'm' again will turn this feature off.
Level 6
- Create a file called number_guessing_game_6.py
- Let the user play several games.
- Pressing 'n' will skip this game and start a new one. Generates a new number to guess.
Exercise: Count unique characters
- Create a file called count_unique_characters.py
- Given a string on the command line, count how many different characters it has.
python count_unique_characters.py abcdaaa
4
Exercise: Convert for-loop to while-loop
- Update the following file.
- Given a for-loop as in the following code, convert it to use a while-loop.
- Range with 3 parameters: range(from, to, step) goes from the first number up to (but not including) the second number, in steps of the third number.
for ix in range(3, 25, 4):
    print(ix)
3
7
11
15
19
23
Solution: Print all the locations in a string
When you start thinking about this exercise, you probably call loc = text.find("c") and then you wonder how you could find the next element.
After a while it might occur to you that the find method can get a second parameter to set the location where the search starts.
Basically you need to call loc = text.find("c", loc + 1) but that looks strange. How can you use loc as a parameter of the function and also assign to it? Programming languages don't have a problem with this, as the assignment happens only after the right-hand side was fully evaluated.
The problem is that now you have two different calls to find: the first one and all the subsequent calls. How could we merge the two calls?
The trick is that you need an initial value for the loc variable, and it has to be -1, so when we call find for the first time, it will start from the first character (index 0).
text = "The black cat climbed the green tree."
loc = -1
while True:
    loc = text.find("c", loc+1)
    if loc == -1:
        break
    print(loc)
Using an additional variable might make the code easier to read:
text = "The black cat climbed the green tree."
start = 0
while True:
    loc = text.find("c", start)
    if loc == -1:
        break
    print(loc)
    start = loc + 1
Solution 1 for Number Guessing
import random
hidden = random.randrange(1, 21)
while True:
    user_input = input("Please enter your guess: ")
    print(user_input)
    guess = int(user_input)
    if guess == hidden:
        print("Hit!")
        break
    if guess < hidden:
        print("Your guess is too low")
    else:
        print("Your guess is too high")
Solution 2 for Number Guessing (x)
The main trick is that you check for the input being "x" before you try to convert it to an integer.
import random
hidden = random.randrange(1, 201)
while True:
    user_input = input("Please enter your guess[x]: ")
    print(user_input)
    if user_input == 'x':
        print("Sad to see you leaving early")
        exit()
    guess = int(user_input)
    if guess == hidden:
        print("Hit!")
        break
    if guess < hidden:
        print("Your guess is too low")
    else:
        print("Your guess is too high")
Solution 3 for Number Guessing (s)
import random
hidden = random.randrange(1, 201)
while True:
    user_input = input("Please enter your guess [x|s]: ")
    print(user_input)
    if user_input == 'x':
        print("Sad to see you leaving early")
        exit()
    if user_input == 's':
        print("The hidden value is ", hidden)
        continue
    guess = int(user_input)
    if guess == hidden:
        print("Hit!")
        break
    if guess < hidden:
        print("Your guess is too low")
    else:
        print("Your guess is too high")
Solution for Number Guessing (debug)
One important thing is to remember that you can create a toggle by just calling not on a boolean variable every time you'd like to flip the switch.
The other one is that flipping the switch (pressing d) and printing the current value because debug mode is on are two separate operations that are not directly related, so they can be implemented separately.
import random
hidden = random.randrange(1, 201)
debug = False
while True:
    if debug:
        print("Debug: ", hidden)
    user_input = input("Please enter your guess [x|s|d]: ")
    print(user_input)
    if user_input == 'x':
        print("Sad to see you leaving early")
        exit()
    if user_input == 's':
        print("The hidden value is ", hidden)
        continue
    if user_input == 'd':
        debug = not debug
        continue
    guess = int(user_input)
    if guess == hidden:
        print("Hit!")
        break
    if guess < hidden:
        print("Your guess is too low")
    else:
        print("Your guess is too high")
Solution for Number Guessing (move)
import random
UPPER_LIMIT = 200
hidden = random.randrange(1, UPPER_LIMIT + 1)
debug = False
move = False
while True:
    if debug:
        print(f"Debug: {hidden}")
        print(f"Move: {move}")
    if move:
        mv = random.randrange(-2, 3)
        if 1 <= hidden + mv <= UPPER_LIMIT:
            hidden = hidden + mv
    user_input = input("Please enter your guess [x|s|d|m]: ")
    print(user_input)
    if user_input == 'x':
        print("Sad to see you leaving early")
        exit()
    if user_input == 's':
        print("The hidden value is ", hidden)
        continue
    if user_input == 'd':
        debug = not debug
        continue
    if user_input == 'm':
        move = not move
        continue
    guess = int(user_input)
    if guess == hidden:
        print("Hit!")
        break
    if guess < hidden:
        print("Your guess is too low")
    else:
        print("Your guess is too high")
Solution for Number Guessing (multi-game)
import random
debug = False
move = False
while True:
    print("\nWelcome to another Number Guessing game")
    hidden = random.randrange(1, 201)
    while True:
        if debug:
            print("Debug: ", hidden)
        if move:
            mv = random.randrange(-2, 3)
            hidden = hidden + mv
        user_input = input("Please enter your guess [x|s|d|m|n]: ")
        print(user_input)
        if user_input == 'x':
            print("Sad to see you leaving early")
            exit()
        if user_input == 's':
            print("The hidden value is ", hidden)
            continue
        if user_input == 'd':
            debug = not debug
            continue
        if user_input == 'm':
            move = not move
            continue
        if user_input == 'n':
            print("Giving up, eh?")
            break
        guess = int(user_input)
        if guess == hidden:
            print("Hit!")
            break
        if guess < hidden:
            print("Your guess is too low")
        else:
            print("Your guess is too high")
Solution: Count unique characters
- set
import sys
if len(sys.argv) != 2:
    exit("Need a string to count")
text = sys.argv[1]
unique = ''
for cr in text:
    if cr not in unique:
        unique += cr
print(len(unique))
The above solution works, but there is a better solution using sets that we have not learned yet. Nevertheless, let me show you that solution:
import sys
if len(sys.argv) != 2:
    exit("Need a string to count")
text = sys.argv[1]
set_of_chars = set(text)
print(len(set_of_chars))
Solution: Convert for-loop to while-loop
ix = 3
while ix < 25:
    print(ix)
    ix += 4
Formatted strings
format - sprintf
- %
- %s
- f
- {}
- format
- sprintf
age = 42.12
name = 'Foo Bar'
str_concatenate = "The user " + name + " was born " + str(age) + " years ago."
print(str_concatenate)
str_percentage = "The user %s was born %s years ago." % (name, age)
print(str_percentage)
str_format = "The user {} was born {} years ago.".format(name, age)
print(str_format)
str_f_string = f"The user {name} was born {age} years ago."
print(str_f_string)
The user Foo Bar was born 42.12 years ago.
The user Foo Bar was born 42.12 years ago.
The user Foo Bar was born 42.12 years ago.
The user Foo Bar was born 42.12 years ago.
- When using % to print more than one value, put the values in parentheses forming a tuple.
- In Python 2.6 and 3.0 you need to write {0} {1} etc. as the placeholders of the format method; later versions number empty placeholders automatically.
- f-strings were added in Python 3.6 (released on 2016-12-23)
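The first point above deserves a small demonstration. With a single value the parentheses are optional, but when the value itself is a tuple it has to be wrapped in a one-element tuple. A sketch with made-up values:

```python
name = "Foo Bar"
age = 42.12

one = "Name: %s" % name                  # single value, no tuple needed
two = "Name: %s Age: %s" % (name, age)   # several values, tuple required
print(one)   # Name: Foo Bar
print(two)   # Name: Foo Bar Age: 42.12

point = (2, 3)
# "%s" % point would raise TypeError, because the tuple is unpacked
# into two values while there is only one placeholder.
wrapped = "Point: %s" % (point,)
print(wrapped)   # Point: (2, 3)
```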
printf using old %-syntax
- printf
- %
- This slide is here only as a historical page so when you see the older ways of writing you'll know what you see.
- It is recommended to use f-strings or, if those cannot be used for some reason, the format method.
v = 65
print("<%s>" % v) # <65>
print("<%10s>" % v) # <        65>
print("<%-10s>" % v) # <65        >
print("<%c>" % v) # <A>
print("<%d>" % v) # <65>
print("<%0.5d>" % v) # <00065>
Examples using format with names
txt = "Foo Bar"
num = 42.12
print("The user {name} was born {age} years ago.".format(name = txt, age = num))
The user Foo Bar was born 42.12 years ago.
Format columns
- format
In this example we use a list of lists that we have not learned yet, but don't worry about that for now. Focus on the output of the two print statements.
data = [
    ["Foo Bar", 42],
    ["Bjorg", 12345],
    ["Roza", 7],
    ["Long Name Joe", 3],
    ["Joe", 12345677889],
]
for entry in data:
    print("{} {}".format(entry[0], entry[1]))
print('-' * 16)
for entry in data:
    print("{:<8}|{:>7}".format(entry[0], entry[1]))
Foo Bar 42
Bjorg 12345
Roza 7
Long Name Joe 3
Joe 12345677889
----------------
Foo Bar |     42
Bjorg   |  12345
Roza    |      7
Long Name Joe|      3
Joe     |12345677889
Examples using format - alignment
- format
txt = "Some text"
print("'{}'".format(txt)) # as is: 'Some text'
print("'{:12}'".format(txt)) # left: 'Some text   '
print("'{:<12}'".format(txt)) # left: 'Some text   '
print("'{:>12}'".format(txt)) # right: '   Some text'
print("'{:^12}'".format(txt)) # center: ' Some text  '
Format - string
- format
- :s
name = "Foo Bar"
print("{:s}".format(name))
print("{}".format(name))
Foo Bar
Foo Bar
Format characters and types (binary, octal, hexa)
- format
- :b
- :c
- :d
- :o
- :x
- :X
- :n
val = 42
print("{:b}".format(val)) # binary: 101010
print("{:c}".format(val)) # character: *
print("{:d}".format(val)) # decimal: 42 (default)
print("{:o}".format(val)) # octal: 52
print("{:x}".format(val)) # hexa: 2a
print("{:X}".format(val)) # hexa: 2A
print("{:n}".format(val)) # number: 42
print("{}".format(val)) # 42 (same as decimal)
# Zero padding
print("'{:2n}'".format(3)) # ' 3'
print("'{:02n}'".format(3)) # '03'
print("'{:02n}'".format(14)) # '14'
# Zero padding hexa
print("'{:2X}'".format(3)) # ' 3'
print("'{:02X}'".format(3)) # '03'
print("'{:02X}'".format(14)) # '0E'
print("'{:02X}'".format(70)) # '46'
Format floating point number
- :e
- :E
- :f
- :F
- :g
- :G
- :n
x = 412.345678901
print("{:e}".format(x)) # exponent: 4.123457e+02
print("{:E}".format(x)) # Exponent: 4.123457E+02
print("{:f}".format(x)) # fixed point: 412.345679 (default precision is 6)
print("{:.2f}".format(x)) # fixed point: 412.35 (set precision to 2)
print("{:F}".format(x)) # same as f. 412.345679
print("{:g}".format(x)) # generic: 412.346 (default precision is 6)
print("{:G}".format(x)) # generic: 412.346
print("{:n}".format(x)) # number: 412.346
print("{}".format(x)) # defaults to g 412.345678901
Examples using format - indexing
- format
txt = "Foo Bar"
num = 42.12
print("The user {} was born {} years ago.".format(txt, num))
print("The user {0} was born {1} years ago.".format(txt, num))
print("The user {1} was born {0} years ago.".format(num, txt))
print("{0} is {0} and {1} years old.".format(txt, num))
The user Foo Bar was born 42.12 years ago.
The user Foo Bar was born 42.12 years ago.
The user Foo Bar was born 42.12 years ago.
Foo Bar is Foo Bar and 42.12 years old.
Format characters and types using f-format
val = 42
print(f"{val:b}") # binary: 101010
print(f"{val:c}") # character: *
print(f"{val:d}") # decimal: 42 (default)
print(f"{val:o}") # octal: 52
print(f"{val:x}") # hexa: 2a
print(f"{val:X}") # hexa: 2A
print(f"{val:n}") # number: 42
print(f"{val}") # 42 (same as decimal)
# Zero padding
val = 3
print(f"'{val:2n}'") # ' 3'
print(f"'{val:02n}'") # '03'
val = 14
print(f"'{val:02n}'") # '14'
# Zero padding hexa
val = 3
print(f"'{val:2X}'") # ' 3'
print(f"'{val:02X}'") # '03'
val = 14
print(f"'{val:02X}'") # '0E'
val = 70
print(f"'{val:02X}'") # '46'
f-format (formatted string literals)
- f
Since Python 3.6
name = "Foo Bar"
age = 42.12
pi = 3.141592653589793
r = 2
print(f"The user {name} was born {age} years ago.")
print(f"The user {name:10} was born {age} years ago.")
print(f"The user {name:>10} was born {age} years ago.")
print(f"The user {name:>10} was born {age:>10} years ago.")
print(f"PI is '{pi:.3}'.") # 3 significant digits (no type given: behaves like g)
print(f"PI is '{pi:.3f}'.") # number of digits after decimal point
print(f"Area is {pi * r ** 2}")
print(f"Area is {pi * r ** 2:.3f}")
The user Foo Bar was born 42.12 years ago.
The user Foo Bar    was born 42.12 years ago.
The user    Foo Bar was born 42.12 years ago.
The user    Foo Bar was born      42.12 years ago.
PI is '3.14'.
PI is '3.142'.
Area is 12.566370614359172
Area is 12.566
Format floating point numbers using f-format
val = 412.345678901
print(f"{val:e}") # exponent: 4.123457e+02
print(f"{val:E}") # Exponent: 4.123457E+02
print(f"{val:f}") # fixed point: 412.345679 (default precision is 6)
print(f"{val:.2f}") # fixed point: 412.35 (set precision to 2)
print(f"{val:F}") # same as f. 412.345679
print(f"{val:g}") # generic: 412.346 (default precision is 6)
print(f"{val:G}") # generic: 412.346
print(f"{val:n}") # number: 412.346
print(f"{val}") # defaults to g 412.345678901
Format braces, bracket, and parentheses
These are just some extreme special cases. Most people won't need to know about them.
- To print { include {{.
- To print } include }}.
print("{{{}}}".format(42)) # {42}
print("{{ {} }}".format(42)) # { 42 }
print("[{}] ({})".format(42, 42)) # [42] (42)
print("%{}".format(42)) # %42
Anything that is not in curly braces will be printed as it is.
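The same doubling rule applies to f-strings. A minimal sketch, reusing the value 42 from the example above:

```python
val = 42

a = f"{{{val}}}"          # doubled braces around a placeholder
b = f"{{ {val} }}"
c = f"[{val}] ({val})"    # brackets and parentheses need no escaping

print(a)  # {42}
print(b)  # { 42 }
print(c)  # [42] (42)
```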
parameterized formatter
def formatter(value, filler, width):
    return "{var:{fill}>{width}}".format(var=value, fill=filler, width=width)
text = formatter(23, "0", 7)
print(text)
print(formatter(42, " ", 7))
print(formatter(1234567, " ", 7))
0000023
     42
1234567
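The nested placeholders also work in f-strings (Python 3.6 or later). A sketch of the same idea; formatter_f is a name invented for this example:

```python
def formatter_f(value, filler, width):
    # The fill character and the width are taken from variables.
    return f"{value:{filler}>{width}}"

print(formatter_f(23, "0", 7))       # 0000023
print(formatter_f(42, " ", 7))       #      42
print(formatter_f(1234567, " ", 7))  # 1234567
```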
format binary, octal, hexa numbers
a = 42
text = "{:b}".format(a)
print(text) # 101010
text = "{:#b}".format(a)
print(text) # 0b101010
a = 42
text = "{:o}".format(a)
print(text) # 52
text = "{:#o}".format(a)
print(text) # 0o52
a = 42
text = "{:x}".format(a)
print(text) # 2a
text = "{:#x}".format(a)
print(text) # 0x2a
text = "{:#X}".format(a)
print(text) # 0X2A
Examples using format with attributes of objects
This is also a rather strange example, I don't think I'd use it in real code.
import sys
print("{0.executable}".format(sys))
print("{system.argv[0]}".format(system = sys))
/home/gabor/venv3/bin/python
formatted_attributes.py
raw f-format
- f
- r
name="foo"
print(r"a\nb {name}")
print(rf"a\nb {name}")
print(fr"a\nb {name}") # this is better (for vim)
a\nb {name}
a\nb foo
a\nb foo
Format with conversion (stringification with str or repr)
Adding !s or !r in the place-holder tells it to call the str or repr method of the object, respectively.
- repr (__repr__): its goal is to be unambiguous
- str (__str__): its goal is to be readable
- The default implementations of both are useless
- Suggestion
- Difference between str and repr
class Point:
    def __init__(self, a, b):
        self.x = a
        self.y = b
p = Point(2, 3)
print(p) # <__main__.Point object at 0x10369d750>
print("{}".format(p)) # <__main__.Point object at 0x10369d750>
print("{!s}".format(p)) # <__main__.Point object at 0x10369d750>
print("{!r}".format(p)) # <__main__.Point object at 0x10369d750>
class Point:
    def __init__(self, a, b):
        self.x = a
        self.y = b
    def __format__(self, spec):
        # print(spec)  # empty string
        return("{{'x':{}, 'y':{}}}".format(self.x, self.y))
    def __str__(self):
        return("({},{})".format(self.x, self.y))
    def __repr__(self):
        return("Point({}, {})".format(self.x, self.y))
p = Point(2, 3)
print(p) # (2,3)
print("{}".format(p)) # {'x':2, 'y':3}
print("{!s}".format(p)) # (2,3)
print("{!r}".format(p)) # Point(2, 3)
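The difference between the two conversions is visible even on built-in types. A small sketch using a plain string (the value is arbitrary); the same !s and !r markers can be used in an f-string.

```python
text = "Foo\tBar"

as_str = "{!s}".format(text)    # readable: contains a real tab character
as_repr = "{!r}".format(text)   # unambiguous: quotes and the escape are visible
print(as_repr)                  # 'Foo\tBar'

print(as_str == str(text))      # True
print(as_repr == repr(text))    # True
print(f"{text!r}" == as_repr)   # True
```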
Lists
Anything can be a list
- Comma separated values
- In square brackets
- Can be any value, and a mix of values: Integer, Float, Boolean, None, String, List, Dictionary, ...
- But usually they are of the same type:
- Distances of astronomical objects
- Chemical Formulas
- Filenames
- Names of devices
- Objects describing attributes of a network device.
- Actions to do on your data.
stuff = [42, 3.14, True, None, "Foo Bar", ['another', 'list'], {'a': 'Dictionary', 'language' : 'Python'}]
print(stuff)
Output:
[42, 3.14, True, None, 'Foo Bar', ['another', 'list'], {'a': 'Dictionary', 'language': 'Python'}]
Any layout
- Layout is flexible
- Trailing comma is optional. It does not disturb us. Nor Python.
more_stuff = [
    42,
    3.14,
    True,
    None,
    "Foo Bar",
    ['another', 'list'],
    {
        'a': 'Dictionary',
        'language' : 'Python',
    },
]
print(more_stuff)
Output:
[42, 3.14, True, None, 'Foo Bar', ['another', 'list'], {'a': 'Dictionary', 'language': 'Python'}]
Access elements of a list
- []
- len
- Access single element: [index]
- Access a sublist: [start:end]
- Creates a copy of that sublist
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
print(planets) # ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
print(len(planets)) # 6
print(type(planets)) # <class 'list'>
print(planets[0]) # Mercury
print(type(planets[0])) # <class 'str'>
print(planets[3]) # Mars
print(planets[0:2]) # ['Mercury', 'Venus']
print(planets[1:4]) # ['Venus', 'Earth', 'Mars']
print(planets[0:1]) # ['Mercury']
print(type(planets[0:1])) # <class 'list'>
print(planets[2:]) # ['Earth', 'Mars', 'Jupiter', 'Saturn']
print(planets[:3]) # ['Mercury', 'Venus', 'Earth']
print(planets[:]) # ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
List slice with steps
- List slice with step: [start:end:step]
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
print(letters[::]) # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
print(letters[::1]) # ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
print(letters[::2]) # ['a', 'c', 'e', 'g', 'i']
print(letters[1::2]) # ['b', 'd', 'f', 'h', 'j']
print(letters[2:8:2]) # ['c', 'e', 'g']
print(letters[1:20:3]) # ['b', 'e', 'h']
print(letters[20:30:3]) # []
print(letters[8:3:-2]) # ['i', 'g', 'e']
Change a List
- :
fruits = ['apple', 'banana', 'peach', 'strawberry']
print(fruits) # ['apple', 'banana', 'peach', 'strawberry']
fruits[0] = 'orange'
print(fruits) # ['orange', 'banana', 'peach', 'strawberry']
print(fruits[1:3]) # ['banana', 'peach']
fruits[1:3] = ['grape', 'kiwi']
print(fruits) # ['orange', 'grape', 'kiwi', 'strawberry']
print(fruits[1:3]) # ['grape', 'kiwi']
fruits[1:3] = ['mango']
print(fruits) # ['orange', 'mango', 'strawberry']
print(fruits[1:2]) # ['mango']
fruits[1:2] = ["banana", "peach"]
print(fruits) # ['orange', 'banana', 'peach', 'strawberry']
print(fruits[1:1]) # []
fruits[1:1] = ['apple', 'pineapple']
print(fruits) # ['orange', 'apple', 'pineapple', 'banana', 'peach', 'strawberry']
- Unlike strings, lists are mutable. You can change the content of a list by assigning values to its elements.
- You can use the slice notation to change several elements at once.
- You can even have a different number of elements in the slice and in the replacement. This will also change the length of the list.
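Since the replacement can be shorter than the slice, assigning an empty list removes elements. A sketch of that idea with its own sample list:

```python
fruits = ['orange', 'apple', 'pineapple', 'banana', 'peach', 'strawberry']

fruits[1:3] = []                  # replace two elements with nothing
print(fruits)                     # ['orange', 'banana', 'peach', 'strawberry']
print(len(fruits))                # 4

fruits[len(fruits):] = ['kiwi']   # an empty slice at the end appends
print(fruits)                     # ['orange', 'banana', 'peach', 'strawberry', 'kiwi']
```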
Change sublist vs change element of a list
fruits = ['orange', 'mango', 'strawberry']
print(fruits[1:2]) # ['mango']
fruits[1:2] = ["banana", "peach"]
print(fruits) # ['orange', 'banana', 'peach', 'strawberry']
print(fruits[1])
print(fruits[2])
fruits = ['orange', 'mango', 'strawberry']
print(fruits[1]) # mango
fruits[1] = ["banana", "peach"]
print(fruits) # ['orange', ['banana', 'peach'], 'strawberry']
print(fruits[1]) # ['banana', 'peach']
print(fruits[2]) # strawberry
print(fruits[1][0]) # banana
Change with steps
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(numbers) # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(numbers[1::2]) # [2, 4, 6, 8, 10, 12]
numbers[1::2] = [0, 0, 0, 0, 0, 0]
print(numbers) # [1, 0, 3, 0, 5, 0, 7, 0, 9, 0, 11, 0]
numbers[1::2] = [42] * 6
print(numbers) # [1, 42, 3, 42, 5, 42, 7, 42, 9, 42, 11, 42]
List assignment and list copy
- [:]
fruits = ['apple', 'banana', 'peach', 'kiwi']
salad = fruits
fruits[0] = 'orange'
print(fruits) # ['orange', 'banana', 'peach', 'kiwi']
print(salad) # ['orange', 'banana', 'peach', 'kiwi']
- There is one list in the memory and two pointers to it.
- If you really want to make a copy the pythonic way is to use the slice syntax.
- It creates a shallow copy.
fruits = ['apple', 'banana', 'peach', 'kiwi']
salad = fruits[:]
fruits[0] = 'orange'
print(fruits) # ['orange', 'banana', 'peach', 'kiwi']
print(salad) # ['apple', 'banana', 'peach', 'kiwi']
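Whether two names refer to the same list can be checked with the is operator. The sketch below also shows list() and the copy method, which give the same kind of shallow copy as the slice; the variable names are only for illustration.

```python
fruits = ['apple', 'banana', 'peach', 'kiwi']

alias = fruits               # second name for the same list
by_slice = fruits[:]         # shallow copy
by_list = list(fruits)       # shallow copy
by_method = fruits.copy()    # shallow copy

print(alias is fruits)       # True
print(by_slice is fruits)    # False
print(by_slice == fruits)    # True

fruits[0] = 'orange'
print(alias[0])              # orange
print(by_list[0])            # apple
print(by_method[0])          # apple
```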
Shallow vs. Deep copy of lists
- copy
- deepcopy
copy.copy() # shallow copy
copy.deepcopy() # deep copy
fruits = ['apple', ['banana', 'peach'], 'kiwi']
print(fruits) # ['apple', ['banana', 'peach'], 'kiwi']
print(fruits[0]) # apple
print(fruits[1][0]) # banana
salad = fruits[:]
fruits[0] = 'orange'
fruits[1][0] = 'mango'
print(fruits) # ['orange', ['mango', 'peach'], 'kiwi']
print(salad) # ['apple', ['mango', 'peach'], 'kiwi']
from copy import deepcopy
fruits = ['apple', ['banana', 'peach'], 'kiwi']
print(fruits) # ['apple', ['banana', 'peach'], 'kiwi']
print(fruits[0]) # apple
print(fruits[1][0]) # banana
salad = deepcopy(fruits)
fruits[0] = 'orange'
fruits[1][0] = 'mango'
print(fruits) # ['orange', ['mango', 'peach'], 'kiwi']
print(salad) # ['apple', ['banana', 'peach'], 'kiwi']
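The copy.copy() function mentioned above behaves like the slice syntax. A small sketch that uses the is operator to show which objects are shared after each kind of copy:

```python
import copy

fruits = ['apple', ['banana', 'peach'], 'kiwi']
shallow = copy.copy(fruits)
deep = copy.deepcopy(fruits)

# The outer lists are all different objects.
print(shallow is fruits)        # False
print(deep is fruits)           # False

# The shallow copy shares the inner list, the deep copy does not.
print(shallow[1] is fruits[1])  # True
print(deep[1] is fruits[1])     # False

fruits[1][0] = 'mango'
print(shallow[1])  # ['mango', 'peach']
print(deep[1])     # ['banana', 'peach']
```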
join
- join
fruits = ['apple', 'banana', 'peach', 'kiwi']
together = ':'.join(fruits)
print(together) # apple:banana:peach:kiwi
together = ' '.join(fruits)
print(together) # apple banana peach kiwi
mixed = ' -=<> '.join(fruits)
print(mixed) # apple -=<> banana -=<> peach -=<> kiwi
another = ''.join(fruits)
print(another) # applebananapeachkiwi
csv = ','.join(fruits)
print(csv) # apple,banana,peach,kiwi
- For real CSV files use the csv module.
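A short sketch of why the csv module is safer than join. It assumes one of the values contains a comma (made-up data) and writes to an in-memory buffer instead of a real file:

```python
import csv
import io

fruits = ['apple', 'banana, ripe', 'peach']

# A plain join does not protect the comma inside the second value.
naive = ','.join(fruits)
print(naive)  # apple,banana, ripe,peach

# csv.writer quotes the field that contains the separator.
buffer = io.StringIO()
csv.writer(buffer).writerow(fruits)
line = buffer.getvalue()
print(line.strip())  # apple,"banana, ripe",peach
```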
join list of numbers
a = ["x", "2", "y"]
b = ["x", 2, "y"]
print(":".join(a)) # x:2:y
# print(":".join(b)) # TypeError: sequence item 1: expected str instance, int found
# convert elements to string using map
print(":".join( map(str, b) )) # x:2:y
# convert elements to string using a generator expression
print(":".join( str(x) for x in b )) # x:2:y
split
- split
- list
- Special case: To split a string to its characters: Use the list() function.
- Split using more than one splitter: use re.split
words = "ab:cd::ef".split(':')
print(words) # ['ab', 'cd', '', 'ef']
by_space = "foo   bar baz".split(' ')
print(by_space) # ['foo', '', '', 'bar', 'baz']
# special case: split on any run of whitespace
names = "foo   bar baz".split()
print(names) # ['foo', 'bar', 'baz']
# special case: split to characters
chars = list("ab cd")
print(chars) # ['a', 'b', ' ', 'c', 'd']
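The re.split function mentioned above can be sketched like this, assuming the three splitters are colon, semicolon, and comma:

```python
import re

text = "ab:cd;ef,gh"
# The character class [:;,] matches any one of the three splitters.
parts = re.split(r'[:;,]', text)
print(parts)  # ['ab', 'cd', 'ef', 'gh']
```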
for loop on lists
- for
- in
things = ['apple', 'banana', 'peach', 42]
for var in things:
    print(var)
Output:
apple
banana
peach
42
in list
- in
Check if the value is in the list?
words = ['apple', 'banana', 'peach', '42']
if 'apple' in words:
    print('found apple')
if 'a' in words:
    print('found a')
else:
    print('NOT found a')
if 42 in words:
    print('found 42')
else:
    print('NOT found 42')
# found apple
# NOT found a
# NOT found 42
Where is the element in the list
- index
words = ['cat', 'dog', 'snake', 'camel']
print(words.index('snake'))
print(words.index('python'))
Output:
2
Traceback (most recent call last):
File "examples/lists/index.py", line 6, in <module>
print(words.index('python'))
ValueError: 'python' is not in list
Index improved
- index
words = ['cat', 'dog', 'snake', 'camel']
name = 'snake'
if name in words:
    print(words.index(name))
name = 'python'
if name in words:
    print(words.index(name))
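Another possible approach is to call index directly and catch the ValueError. This is only a sketch: the helper name find_index is made up for this example, and returning None for a missing value is a design choice, not something the list type does by itself.

```python
words = ['cat', 'dog', 'snake', 'camel']

def find_index(items, value):
    """Return the index of value in items, or None if it is missing."""
    try:
        return items.index(value)
    except ValueError:
        return None

print(find_index(words, 'snake'))   # 2
print(find_index(words, 'python'))  # None
```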
[].insert
- insert
- unshift
words = ['apple', 'banana', 'cat']
print(words) # ['apple', 'banana', 'cat']
words.insert(2, 'zebra')
print(words) # ['apple', 'banana', 'zebra', 'cat']
words.insert(0, 'dog')
print(words) # ['dog', 'apple', 'banana', 'zebra', 'cat']
# Instead of this, use append (next slide)
words.insert(len(words), 'olifant')
print(words) # ['dog', 'apple', 'banana', 'zebra', 'cat', 'olifant']
[].append
- append
names = ['Foo', 'Bar', 'Zorg', 'Bambi']
print(names) # ['Foo', 'Bar', 'Zorg', 'Bambi']
names.append('Qux')
print(names) # ['Foo', 'Bar', 'Zorg', 'Bambi', 'Qux']
[].remove
- remove
names = ['Joe', 'Kim', 'Jane', 'Bob', 'Kim']
print(names) # ['Joe', 'Kim', 'Jane', 'Bob', 'Kim']
print(names.remove('Kim')) # None
print(names) # ['Joe', 'Jane', 'Bob', 'Kim']
print(names.remove('George'))
# Traceback (most recent call last):
# File "examples/lists/remove.py", line 9, in <module>
# print(names.remove('George')) # None
# ValueError: list.remove(x): x not in list
Removes the first element that has the given value. Raises a ValueError exception if there is no such element in the list.
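Because remove deletes only the first match, removing every occurrence takes a loop. A minimal sketch that also shows how checking with in avoids the exception:

```python
names = ['Joe', 'Kim', 'Jane', 'Bob', 'Kim']

# Keep removing until no 'Kim' is left in the list.
while 'Kim' in names:
    names.remove('Kim')
print(names)  # ['Joe', 'Jane', 'Bob']

# Checking with "in" first avoids the ValueError for a missing value.
if 'George' in names:
    names.remove('George')
print(names)  # ['Joe', 'Jane', 'Bob']
```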
Remove element by index [].pop
- pop
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter']
print(planets) # ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter']
third = planets.pop(2)
print(third) # Earth
print(planets) # ['Mercury', 'Venus', 'Mars', 'Jupiter']
last = planets.pop()
print(last) # Jupiter
print(planets) # ['Mercury', 'Venus', 'Mars']
# planets.pop(4) # IndexError: pop index out of range
jupiter_landers = []
# jupiter_landers.pop() # IndexError: pop from empty list
Remove and return the element at the given index, or the last element if no index is given. Throws an exception if the list is empty or the index is out of range.
Remove first element of list
- pop
- shift
To remove the first element of a list, use pop(0):
names = ['foo', 'bar', 'baz', 'moo']
first = names.pop(0)
print(first) # foo
print(names) # ['bar', 'baz', 'moo']
Remove several elements of list by index
- slice
To remove several elements by their index, assign an empty list to a slice:
names = ['foo', 'bar', 'baz', 'moo', 'qux']
names[2:4] = []
print(names) # ['foo', 'bar', 'qux']
Use list as a queue - FIFO
a_queue = []
print(a_queue)
a_queue.append('Moo')
print(a_queue)
a_queue.append('Bar')
print(a_queue)
first = a_queue.pop(0)
print(first)
print(a_queue)
Output:
[]
['Moo']
['Moo', 'Bar']
Moo
['Bar']
Queue using deque from collections
- collections
- deque
- append
- popleft
from collections import deque
fruits = deque()
print(type(fruits)) # <class 'collections.deque'>
print(fruits) # deque([])
print(len(fruits)) # 0
fruits.append('Apple')
print(fruits) # deque(['Apple'])
print(len(fruits)) # 1
fruits.append('Banana')
fruits.append('Peach')
print(fruits) # deque(['Apple', 'Banana', 'Peach'])
print(len(fruits)) # 3
nxt = fruits.popleft()
print(nxt) # Apple
print(fruits) # deque(['Banana', 'Peach'])
print(len(fruits)) # 2
if fruits:
    print("The queue has items")
else:
    print("The queue is empty")
nxt = fruits.popleft()
nxt = fruits.popleft()
if fruits:
    print("The queue has items")
else:
    print("The queue is empty")
Output:
<class 'collections.deque'>
deque([])
0
deque(['Apple'])
1
deque(['Apple', 'Banana', 'Peach'])
3
Apple
deque(['Banana', 'Peach'])
2
The queue has items
The queue is empty
- .append
- .popleft
- len() number of elements
- if q: to see if it has elements or if it is empty
- dequeue
Fixed size queue
- maxlen
from collections import deque
queue = deque([], maxlen = 3)
print(len(queue)) # 0
print(queue.maxlen) # 3
queue.append("Foo")
queue.append("Bar")
queue.append("Baz")
print(queue) # deque(['Foo', 'Bar', 'Baz'], maxlen=3)
queue.append("Zorg") # Automatically removes the left-most (first) element
print(queue) # deque(['Bar', 'Baz', 'Zorg'], maxlen=3)
List as a stack - LIFO
stack = []
stack.append("Joe")
print(stack)
stack.append("Jane")
print(stack)
stack.append("Bob")
print(stack)
while stack:
    name = stack.pop()
    print(name)
    print(stack)
Output:
['Joe']
['Joe', 'Jane']
['Joe', 'Jane', 'Bob']
Bob
['Joe', 'Jane']
Jane
['Joe']
Joe
[]
stack with deque
from collections import deque
stack = deque()
stack.append("Joe")
stack.append("Jane")
stack.append("Bob")
while stack:
    name = stack.pop()
    print(name)
# Bob
# Jane
# Joe
Exercise: Queue
- Create file called queue_of_people.py
- The application should manage a queue of people.
- It will prompt the user for a new name by printing :, the user can type in a name and press ENTER. The app will add the name to the queue.
- If the user types in "n" then the application will remove the first name from the queue and print it.
- If the user types in "x" then the application will print the list of users who were left in the queue and it will exit.
- If the user types in "s" then the application will show the current number of elements in the queue.
$ python queue_of_people.py
: Joe
: Jane
: Mary
: s
3
: n
next is Joe
: n
next is Jane
: Peter
: n
next is Mary
: n
next is Peter
: n
the queue is empty
: Bar
: Tal
: x
Left in the queue: Bar, Tal
$
Exercise: Stack
- Create file called reverse_polish_calculator.py
- Implement a Reverse Polish Calculator
2
3
4
+
*
=
14
x = eXit, s = Show, [+-*/=]
:23
:19
:7
:8
:+
:3
:-
:/
:s
[23.0, -0.631578947368421]
:+
:=
22.36842105263158
:s
[]
:x
Exercise: MasterMind
- Create file called mastermind.py
- Implement the Master Mind board game.
- The computer "thinks" a number with 4 different digits.
- The user guesses which digits.
- For every digit that matches both in value and in location the computer gives a *.
- For every digit that matches in value, but not in location, the computer gives you a +.
- The user tries to guess the given number in as few guesses as possible.
Computer:
2153 (this is hidden)
User Response
2467 * (because 2 is in the right place but none of the other digits match)
2715 *++ (because 2 is in the right place. 1 and 5 are used but in the wrong place. 7 not in use)
- Wordle is basically the same game, just with letters and the extra limitation that each guess must be a valid word.
Solution: Queue with list
queue = []
while True:
    inp = input(":")
    inp = inp.rstrip("\n")
    if inp == 'x':
        for name in queue:
            print(name)
        exit()
    if inp == 's':
        print(len(queue))
        continue
    if inp == 'n':
        if len(queue) > 0:
            print("next is {}".format(queue.pop(0)))
        else:
            print("the queue is empty")
        continue
    queue.append(inp)
Solution: Queue with deque
from collections import deque
queue = deque()
while True:
    inp = input(":")
    inp = inp.rstrip("\n")
    if inp == 'x':
        for name in queue:
            print(name)
        exit()
    if inp == 's':
        print(len(queue))
        continue
    if inp == 'n':
        if len(queue) > 0:
            print("next is {}".format(queue.popleft()))
        else:
            print("the queue is empty")
        continue
    queue.append(inp)
Solution: Reverse Polish calculator (stack) with lists
stack = []
print("x = eXit, s = Show, [+-*/=]")
while True:
    val = input(':')
    if val == 's':
        print(stack)
        continue
    if val == 'x':
        break
    # In every operation a is the value entered last
    # and b is the value entered before it.
    if val == '+':
        a = stack.pop()
        b = stack.pop()
        stack.append(a+b)
        continue
    if val == '-':
        a = stack.pop()
        b = stack.pop()
        stack.append(a-b)
        continue
    if val == '*':
        a = stack.pop()
        b = stack.pop()
        stack.append(a*b)
        continue
    if val == '/':
        a = stack.pop()
        b = stack.pop()
        stack.append(a/b)
        continue
    if val == '=':
        print(stack.pop())
        continue
    stack.append(float(val))
Solution: Reverse Polish calculator (stack) with deque
from collections import deque
stack = deque()
while True:
    val = input(':')
    if val == 'x':
        break
    if val == '+':
        a = stack.pop()
        b = stack.pop()
        stack.append(a+b)
        continue
    if val == '*':
        a = stack.pop()
        b = stack.pop()
        stack.append(a*b)
        continue
    if val == '=':
        print(stack.pop())
        continue
    stack.append(float(val))
Solution: MasterMind
import random
import sys
width = 4
# TODO: verify that the user gave exactly width characters
def main():
    hidden = list(map(str, random.sample(range(10), width)))
    print(f"Hidden numbers: {hidden}")
    while True:
        inp = input("Guess a number: (e.g. 1234) or x to eXit. ")
        if inp == 'x' or inp == 'X':
            exit()
        guess = list(inp)
        print(guess)
        result = []
        for ix in range(len(hidden)):
            if guess[ix] == hidden[ix]:
                result += '*'
            elif guess[ix] in hidden:
                result += '+'
        print(result)
        if result == ['*'] * width:
            print("SUCCESS")
            break
main()
MasterMind to debug
Debug the following version of the MasterMind game.
import random
def number_generator():
    y = [0, 0, 0, 0]
    for i in range(0, 4):
        y[i] = random.randrange(0, 10)
        # print(y)
        if i:
            number += str(y[i])
        else:
            number = str(y[i])
    # print(number)
    return number
def user_input():
    x = input("Type in 4 digits number:")
    if len(x) == 4:
        return x
    else:
        print("wrong input")
        user_input()
def string_compare(x, y):
    r = 0
    q = 0
    for i in range(0, 4):
        if x[i] == y[i]:
            r += 1
            continue
        for j in range(0, 4):
            if x[i] == y[j]:
                if i == j:
                    continue
                else:
                    q += 1
                    break
    return r, q
def print_result(r):
    print("")
    for i in range(0, r[0]):
        print("*", end="")
    for i in range(0, r[1]):
        print("+", end="")
    print("\n")
def main():
    comp = number_generator()
    result = 0
    while True:
        user = user_input()
        result = string_compare(comp, user)
        print_result(result)
        # print(result)
        if result[0] == 4:
            print("Correct!")
            return
main()
Debugging Queue
The following implementation has a bug. (Even though the n was supposed to remove the element and the code seems to mean that it does, we still see two items after we removed the first.)
The question is how to debug this?
q = []
while True:
    name=input("your name: ")
    if name=="n":
        print(q.pop(0))
    if name=="x":
        print(q)
        exit()
    if name=="s":
        print(len(q))
        exit()
    else:
        q.append(name)
        continue
your name: Foo
your name: Bar
your name: n
Foo
your name: s
2
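One way to debug this is to print the whole queue, not only its length, after every command. Assuming the final else is attached only to the last if, that print would show the text "n" itself sitting in the queue. A possible fix is sketched below with an if/elif/else chain. The function name process and the scripted list of inputs are made up for this example so that it runs without prompting.

```python
def process(queue, command):
    """Apply one command to the queue and return the text to print, if any."""
    if command == "n":
        return queue.pop(0) if queue else "the queue is empty"
    elif command == "s":
        return str(len(queue))
    else:
        # Only text that is not a command is added as a name.
        queue.append(command)
        return None

q = []
outputs = []
for entry in ["Foo", "Bar", "n", "s"]:
    result = process(q, entry)
    if result is not None:
        outputs.append(result)

print(outputs)  # ['Foo', '1']
print(q)        # ['Bar']
```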
sort
- sort
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
print(planets) # ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
planets.sort()
print(planets) # ['Earth', 'Jupiter', 'Mars', 'Mercury', 'Saturn', 'Venus']
planets.sort(reverse=True)
print(planets) # ['Venus', 'Saturn', 'Mercury', 'Mars', 'Jupiter', 'Earth']
sort numbers
- sort
- key
- abs
numbers = [7, 2, -4, 19, 8]
print(numbers) # [7, 2, -4, 19, 8]
numbers.sort()
print(numbers) # [-4, 2, 7, 8, 19]
numbers.sort(reverse=True)
print(numbers) # [19, 8, 7, 2, -4]
numbers.sort(key=abs, reverse=True)
print(numbers) # [19, 8, 7, -4, 2]
key sort of strings
- key
- len
- Another example for using a key.
- To sort the list according to length
animals = ['chicken', 'cow', 'snail', 'elephant']
print(animals)
animals.sort()
print(animals)
animals.sort(key=len)
print(animals)
animals.sort(key=len, reverse=True)
print(animals)
Output:
['chicken', 'cow', 'snail', 'elephant']
['chicken', 'cow', 'elephant', 'snail']
['cow', 'snail', 'chicken', 'elephant']
['elephant', 'chicken', 'snail', 'cow']
sort mixed values
mixed = [100, 'foo', 42, 'bar']
print(mixed)
mixed.sort()
print(mixed)
In Python 3 it throws an exception.
Output:
[100, 'foo', 42, 'bar']
Traceback (most recent call last):
File "examples/lists/sort_mixed.py", line 5, in <module>
mixed.sort()
TypeError: '<' not supported between instances of 'str' and 'int'
Python 2 puts the numbers first in numerical order and then the strings in ASCII order.
[100, 'foo', 42, 'bar']
[42, 100, 'bar', 'foo']
sort mixed values fixed with str
mixed = [100, 'foo', 42, 'bar']
print(mixed) # [100, 'foo', 42, 'bar']
mixed.sort(key=str)
print(mixed) # [100, 42, 'bar', 'foo']
sorting with sorted
- sorted
animals = ['chicken', 'cow', 'snail', 'elephant']
print(animals) # ['chicken', 'cow', 'snail', 'elephant']
srt = sorted(animals)
print(srt) # ['chicken', 'cow', 'elephant', 'snail']
print(animals) # ['chicken', 'cow', 'snail', 'elephant']
rev = sorted(animals, reverse=True, key=len)
print(rev) # ['elephant', 'chicken', 'snail', 'cow']
print(animals) # ['chicken', 'cow', 'snail', 'elephant']
sort vs. sorted
The sort() method will sort a list in-place and return None. The built-in sorted() function will return the sorted list and leave the original list intact.
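A small sketch of the difference, including the common mistake of assigning the result of sort() to a variable:

```python
animals = ['chicken', 'cow', 'snail', 'elephant']

# sorted() returns a new list and leaves the original alone.
in_order = sorted(animals)
print(in_order)  # ['chicken', 'cow', 'elephant', 'snail']
print(animals)   # ['chicken', 'cow', 'snail', 'elephant']

# sort() changes the list in-place and returns None,
# so assigning its result does not give you the sorted list.
result = animals.sort()
print(result)    # None
print(animals)   # ['chicken', 'cow', 'elephant', 'snail']
```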
Sorted and change - shallow copy
- Sorted creates a shallow copy of the original list.
- If the list elements are simple values, the result behaves like an independent copy.
planets = ["Mercury", "Venus", "Earth"]
other_planets = planets
sorted_planets = sorted(planets)
planets[0] = "Jupiter"
print(planets)
print(other_planets)
print(sorted_planets)
Output:
['Jupiter', 'Venus', 'Earth']
['Jupiter', 'Venus', 'Earth']
['Earth', 'Mercury', 'Venus']
- If some of the elements are complex structures (lists, dictionaries, etc.) then the internal structures are not copied.
- One can use copy.deepcopy to make sure the whole structure is separated, if that's needed.
planets = [
["Mercury", 1],
["Venus", 2],
["Earth", 3],
["Earth", 2]
]
other_planets = planets
sorted_planets = sorted(planets)
print(sorted_planets)
planets[0][1] = 100
print(planets)
print(other_planets)
print(sorted_planets)
Output:
[['Earth', 2], ['Earth', 3], ['Mercury', 1], ['Venus', 2]]
[['Mercury', 100], ['Venus', 2], ['Earth', 3], ['Earth', 2]]
[['Mercury', 100], ['Venus', 2], ['Earth', 3], ['Earth', 2]]
[['Earth', 2], ['Earth', 3], ['Mercury', 100], ['Venus', 2]]
Sorting characters of a string
letters = 'axzb'
print(letters) # 'axzb'
srt = sorted(letters)
print(srt) # ['a', 'b', 'x', 'z']
print(letters) # 'axzb'
rev = ''.join(srt)
print(rev) # abxz
# in one statement:
rev = ''.join(sorted(letters))
print(rev) # abxz
range
- range
for ix in range(11, 19, 2):
    print(ix)
# 11
# 13
# 15
# 17
for ix in range(5, 7):
    print(ix)
# 5
# 6
for ix in range(3):
    print(ix)
# 0
# 1
# 2
for ix in range(19, 11, -2):
    print(ix)
# 19
# 17
# 15
# 13
Looping over index
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
for var in planets:
    print(var)
Output:
Mercury
Venus
Earth
Mars
Jupiter
Saturn
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
for ix in range(len(planets)):
    print(ix, planets[ix])
Output:
0 Mercury
1 Venus
2 Earth
3 Mars
4 Jupiter
5 Saturn
Enumerate lists
- enumerate
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
for idx, planet in enumerate(planets):
    print(idx, planet)
Output:
0 Mercury
1 Venus
2 Earth
3 Mars
4 Jupiter
5 Saturn
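enumerate also accepts a start value, which is handy when the numbers are shown to people who expect counting to start at 1. A minimal sketch that collects the lines in a list:

```python
planets = ['Mercury', 'Venus', 'Earth']

lines = []
# start=1 makes the counter begin at 1 instead of 0.
for number, planet in enumerate(planets, start=1):
    lines.append(f"{number} {planet}")

print(lines)  # ['1 Mercury', '2 Venus', '3 Earth']
```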
List operators
a = ['one', 'two']
b = ['three']
print(a) # ['one', 'two']
print(a * 2) # ['one', 'two', 'one', 'two']
print(2 * a) # ['one', 'two', 'one', 'two']
print(a + b) # ['one', 'two', 'three']
print(b + a) # ['three', 'one', 'two']
List of lists
x = ['abc', 'def']
print(x) # ['abc', 'def']
y = [x, 'xyz']
print(y) # [['abc', 'def'], 'xyz']
print(y[0]) # ['abc', 'def']
print(x[0]) # abc
print(y[0][0]) # abc
List assignment
List assignment works in "parallel" in Python.
x, y = 1, 2
print(x) # 1
print(y) # 2
x, y = y, x
print(x) # 2
print(y) # 1
def stats(num):
    return sum(num), sum(num)/len(num), min(num), max(num)
total, average, minimum, maximum = stats([2, 3, 4])
print(total, average, minimum, maximum) # 9 3.0 2 4
x,y = f() # works if f returns a list of 2 elements
It will throw a run-time ValueError exception if the number of values in the returned list is not 2. (Both for fewer and for more return values).
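The following sketch shows the ValueError. The function pair is a made-up stand-in for f. It also shows a starred name, which is one way to accept a variable number of values.

```python
def pair():
    # Hypothetical helper that returns exactly two values.
    return [1, 2]

x, y = pair()
print(x, y)  # 1 2

# Too many values on the right-hand side raise ValueError.
failed = False
try:
    x, y = [1, 2, 3]
except ValueError:
    failed = True
print(failed)  # True

# A starred name collects the remaining values into a list.
first, *rest = [1, 2, 3]
print(first)  # 1
print(rest)   # [2, 3]
```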
List documentation
Exercise: color selector menu
- In a script called color_selector_menu.py have a list of colors. Write a script that will display a menu (a list of numbers and the corresponding color) and prompts the user for a number. The user needs to type in one of the numbers. That's the selected color.
- blue
- green
- yellow
- white
- For extra credit make sure the system is user-proof and it won't blow up on various incorrect input values (e.g. a floating point number, a number that is out of range, a non-number).
- For more credit allow the user to supply the number of the color on the command line. python color_selector_menu.py 3. If that is available, don't prompt.
- For further credit allow the user to provide the name of the color on the command line: python color_selector_menu.py yellow Can you handle color names that are not in the expected case (e.g. YelloW)?
- Any more ideas for improvement?
Exercise: count digits
Create a script called count_digits_in_lists.py that, given a list of numbers, counts how many times each digit appears. The output will look like this:
0 1
1 3
2 3
3 2
4 1
5 2
6 2
7 0
8 1
9 1
- Use this skeleton
numbers = [1203, 1256, 312456, 98]
Exercise: Create list
- Create a script called create_list.py that, given a list of strings with words separated by spaces, creates a single list of all the words.
- Skeleton:
lines = [
'grape banana mango',
'nut orange peach',
'apple nut banana apple mango',
]
# ....
print(fruits)
# ....
print(unique_fruits)
- Expected result:
['grape', 'banana', 'mango', 'nut', 'orange', 'peach', 'apple', 'nut', 'banana', 'apple', 'mango']
Then create a list of unique values sorted in alphabetical order.
Expected result:
['apple', 'banana', 'grape', 'mango', 'nut', 'orange', 'peach']
Exercise: Count words
- Create a script called count_words_in_lists.py that given a list of words (for now embedded in the program itself) will count how many times each word appears.
celestial_objects = [
'Moon', 'Gas', 'Asteroid', 'Dwarf', 'Asteroid', 'Moon', 'Asteroid'
]
Expected output:
Moon 2
Gas 1
Asteroid 3
Dwarf 1
Exercise: Check if number is prime
Write a program called is_prime.py that gets a number on the command line and prints "True" if the number is a prime number or "False" if it isn't.
python is_prime.py 42
False
python is_prime.py 19
True
Exercise: DNA sequencing
- Create a file called dna_sequencing.py
- A, C, T, G are called bases or nucleotides
- Accept a sequence on the command line like this: python dna_sequencing.py ACCGXXCXXGTTACTGGGCXTTGTXX
- Given a sequence such as the one above (some nucleotides mixed up with other elements represented by an X)
- First return the sequences containing only ACTG. The above string will be changed to ['ACCG', 'C', 'GTTACTGGGC', 'TTGT'].
- Then sort them by length. Expected result: ['GTTACTGGGC', 'ACCG', 'TTGT', 'C']
- Create a file called extended_dna_sequencing.py
- In this case the original string contains more than one type of foreign element: e.g. 'ACCGXXTXXYYGTTQRACQQTGGGCXTTGTXX'.
- Expected output: ['TGGGC', 'ACCG', 'TTGT', 'GTT', 'AC', 'T']
- Ask for a sequence on the Standard Input (STDIN) like this:
python extended_dna_sequencing.py
Please type in a sequence:
Solution: menu
colors = ['blue', 'yellow', 'black', 'purple']
for ix in range(len(colors)):
    print("{}) {}".format(ix+1, colors[ix]))
selection = input("Select color: ")
if not selection.isdecimal():
    exit(f"We need a number between 1 and {len(colors)}")
if int(selection) < 1 or int(selection) > len(colors):
    exit(f"The number must be between 1 and {len(colors)}")
col = int(selection) - 1
print(colors[col])
- We would like to show a menu where each number corresponds to one element of the list so this is one of the places where we need to iterate over the indexes of a list.
- len(colors) gives us the length of the list (in our case 4).
- range(len(colors)) is the range of numbers from 0 up to, but not including, 4 (in our case), meaning 0, 1, 2, 3.
- (Sometimes people explicitly write 4 in this solution, but if later we change the list and include another color we'll have to remember updating this number as well. This is error prone and it is very easy to deduce this number from the data we already have, the list.)
- We start the list from 0, but when we display the menu we would like to show the numbers 1-4 to make it more human friendly. Therefore we show ix+1 and the color from location ix.
- We ask for input and save it in a variable.
- We use the isdecimal method to check if the user typed in a decimal number. We give an error and exit if not.
- Then we check if the user provided a number in the correct range of values. We give an error and exit if not.
- Then we convert the value to the correct range of numbers (remember, the user sees and selects numbers between 1-4 and we need them between 0-3).
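The same steps can be sketched as two small functions, which makes the checks easy to try without typing at a prompt. The function names menu_lines and select are made up for this example, and returning None for bad input (instead of exiting) is only one possible design.

```python
colors = ['blue', 'yellow', 'black', 'purple']

def menu_lines(items):
    """Build the menu text: numbers start at 1 for the user."""
    return [f"{number}) {item}" for number, item in enumerate(items, start=1)]

def select(items, answer):
    """Return the chosen item, or None if the answer is not a valid number."""
    if not answer.isdecimal():
        return None
    number = int(answer)
    if number < 1 or number > len(items):
        return None
    return items[number - 1]

print(menu_lines(colors))     # ['1) blue', '2) yellow', '3) black', '4) purple']
print(select(colors, '3'))    # black
print(select(colors, '7'))    # None
print(select(colors, '2.5'))  # None
print(select(colors, 'red'))  # None
```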
Solution: count digits
numbers = [1203, 1256, 312456, 98]
count = [0] * 10 # same as [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
for num in numbers:
    for char in str(num):
        count[int(char)] += 1
for d in range(0, 10):
    print("{} {}".format(d, count[d]))
First we have to decide where we are going to store the counts. A 10 element long list seems to fit our requirements so if we have 3 0s and 2 8s we would have [3, 0, 0, 0, 0, 0, 0, 0, 2, 0].
- We have a list of numbers.
- We need a place to store the counters. For this we create a variable called count which is a list of 10 0s. We are going to count the number of times the digit 3 appears in count[3].
- We iterate over the numbers so num is the current number. (e.g. 1203)
- We would like to iterate over the digits in the current number now, but if we write for var in num we will get an error TypeError: 'int' object is not iterable because num is a number, but numbers are not iterables, so we cannot iterate over them. So we need to convert it to a string using str.
- On each iteration char will be one character (which in our case we assume will be a digit, but still stored as a string).
- int(char) will convert the string to a number so for example "2" will be converted to 2.
- count[int(char)] is going to be count[2] if char is "2". That's the location in the list where we count how many times the digit 2 appears in our numbers.
- We increment it by one as we have just encountered a new copy of the given digit.
- That finishes the data collection.
- The second for-loop iterates over all the "possible digits" that is from 0-9, prints out the digit and the counter in the respective place.
Solution: Create list
- unique
- sorted
- set
lines = [
'grape banana mango',
'nut orange peach',
'apple nut banana apple mango',
]
one_line = ' '.join(lines)
print(one_line)
fruits = one_line.split()
print(fruits)
unique_fruits = []
for word in fruits:
    if word not in unique_fruits:
        unique_fruits.append(word)
print(sorted(unique_fruits))
# a simpler way using a set, but we have not learned sets yet.
unique = sorted(set(fruits))
print(unique)
Solution: Count words
celestial_objects = [
'Moon', 'Gas', 'Asteroid', 'Dwarf', 'Asteroid', 'Moon', 'Asteroid'
]
names = []
counter = []
for name in celestial_objects:
    if name in names:
        idx = names.index(name)
        counter[idx] += 1
    else:
        names.append(name)
        counter.append(1)
for i in range(len(names)):
    print("{:12} {}".format(names[i], counter[i]))
celestial_objects = [
'Moon', 'Gas', 'Asteroid', 'Dwarf', 'Asteroid', 'Moon', 'Asteroid'
]
names = []
counter = []
for name in celestial_objects:
    for idx in range(len(names)):
        if name == names[idx]:
            counter[idx] += 1
            break
    else:
        names.append(name)
        counter.append(1)
for i in range(len(names)):
    print("{:12} {}".format(names[i], counter[i]))
Solution: Check if number is prime
import sys
n = int(sys.argv[1])
#print(n)
is_prime = n > 1  # 0 and 1 are not prime
for i in range(2, int( n ** 0.5) + 1):
    if n % i == 0:
        is_prime = False
        break
print(is_prime)
# math.sqrt(n) might be clearer than n ** 0.5
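The same check can be sketched as a function, which makes it easier to try several values. This version uses math.isqrt, the integer square root available in Python 3.8 and later, and treats every number below 2 as not prime. The function name is_prime is chosen here for illustration.

```python
import math

def is_prime(n):
    """Return True if n is a prime number."""
    if n < 2:
        return False
    # math.isqrt returns the integer square root (Python 3.8 and later).
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True

print(is_prime(42))  # False
print(is_prime(19))  # True
print(is_prime(1))   # False
print(is_prime(2))   # True
```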
Solution: DNA sequencing
def get_sequences(dna):
    sequences = dna.split('X')
    sequences.sort(key=len, reverse=True)
    print(sequences)
    new_seq = []
    for w in sequences:
        if len(w) > 0:
            new_seq.append(w)
    return new_seq
if __name__ == '__main__':
    dna = 'ACCGXXCXXGTTACTGGGCXTTGT'
    short_sequences = get_sequences(dna)
    print(short_sequences)
Solution: DNA sequencing other
from dna_sequencing import get_sequences
if __name__ == '__main__':
    dna = 'ACCGXXTXXYYGTTQRACQQTGGGCXTTGTXX'
    filtered = []
    for cr in dna:
        if cr in 'ACGT':
            filtered.append(cr)
        else:
            filtered.append('X')
    #print(filtered)
    dna = ''.join(filtered)
    short_sequences = get_sequences(dna)
    print(short_sequences)
Solution: DNA sequencing using replace
from dna_sequencing import get_sequences
if __name__ == '__main__':
    dna = 'ACCGXXTXXYYGTTQRACQQTGGGCXTTGTXX'
    bad_letters = []
    for cr in dna:
        if cr not in 'ACTGX' and cr not in bad_letters:
            bad_letters.append(cr)
    for cr in bad_letters:
        while cr in dna:
            dna = dna.replace(cr, 'X')
    short_sequences = get_sequences(dna)
    print(short_sequences)
Solution: DNA sequencing using regex
import re
from dna_sequencing import get_sequences
if __name__ == '__main__':
    dna = 'ACCGXXTXXYYGTTQRACQQTGGGCXTTGTXX'
    dna = re.sub(r'[^ACTGX]+', 'X', dna)
    short_sequences = get_sequences(dna)
    print(short_sequences)
Solution: DNA sequencing with filter
dna = 'ACCGXXCXXGTTACTGGGCXTTGT'
sequences = dna.split('X')
sequences.sort(key=len, reverse=True)
def not_empty(x):
    return len(x) > 0
print(sequences)
sequences = list( filter(not_empty, sequences) )
print(sequences)
Solution: DNA sequencing with filter and lambda
dna = 'ACCGXXCXXGTTACTGGGCXTTGT'
sequences = dna.split('X')
sequences.sort(key=len, reverse=True)
print(sequences)
sequences = list( filter(lambda x: len(x) > 0, sequences) )
print(sequences)
[].extend
- extend
names = ['Foo Bar', 'Orgo Morgo']
names.extend(['Joe Doe', 'Jane Doe'])
print(names) # ['Foo Bar', 'Orgo Morgo', 'Joe Doe', 'Jane Doe']
append vs. extend
What is the difference between [].append and [].extend ? The method append adds its parameter as a single element to the list, while extend gets a list and adds its content.
names = ['Foo Bar', 'Orgo Morgo']
more = ['Joe Doe', 'Jane Doe']
names.extend(more)
print(names) # ['Foo Bar', 'Orgo Morgo', 'Joe Doe', 'Jane Doe']
names = ['Foo Bar', 'Orgo Morgo']
names.append(more)
print(names) # ['Foo Bar', 'Orgo Morgo', ['Joe Doe', 'Jane Doe']]
names = ['Foo', 'Bar']
names.append('Qux')
print(names) # ['Foo', 'Bar', 'Qux']
names = ['Foo', 'Bar']
names.extend('Qux')
print(names) # ['Foo', 'Bar', 'Q', 'u', 'x']
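The last example works because extend accepts any iterable, and a string is an iterable of characters. A short sketch with a tuple and a range, plus the += operator, which for lists behaves like extend:

```python
names = ['Foo', 'Bar']

# extend accepts any iterable, for example a tuple or a range.
names.extend(('Qux', 'Zorg'))
print(names)  # ['Foo', 'Bar', 'Qux', 'Zorg']

numbers = [1, 2]
numbers.extend(range(3, 6))
print(numbers)  # [1, 2, 3, 4, 5]

# For lists, += behaves like extend and changes the list in-place.
alias = numbers
numbers += [6]
print(alias)  # [1, 2, 3, 4, 5, 6]
```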
split and extend
When collecting data which is received from a string via splitting, we would like to add the new elements to the existing list:
lines = [
'abc def ghi',
'hello world',
]
collector = []
for l in lines:
    collector.extend(l.split())
    print(collector)
# ['abc', 'def', 'ghi']
# ['abc', 'def', 'ghi', 'hello', 'world']
Tuples
Create tuple
- tuple
- ()
Tuple
- A tuple is a fixed-length immutable list. It cannot change its size or content.
- Can be accessed by index, using the slice notation.
- A tuple is denoted with parentheses: (1,2,3)
planets = ('Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn')
print(planets)
print(planets[1])
print(planets[1:3])
planets.append("Death Star")
print(planets)
Output:
('Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn')
Venus
('Venus', 'Earth')
Traceback (most recent call last):
File "/home/gabor/work/slides/python/examples/lists/tuple.py", line 6, in <module>
planets.append("Death Star")
AttributeError: 'tuple' object has no attribute 'append'
List
- Elements of a list can be changed via their index or via the list slice notation.
- A list can grow and shrink using append and pop methods or using the slice notation.
- A list is denoted with square brackets: [1, 2, 3]
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
print(planets)
print(planets[1])
print(planets[1:3])
planets.append("Death Star")
print(planets)
Output:
['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
Venus
['Venus', 'Earth']
['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn', 'Death Star']
Tuples are rarely used. There are certain places where Python or some module requires a tuple (instead of a list) or returns a tuple (instead of a list) and in each place it will be explained. Otherwise you don't need to use tuples.
e.g. keys of dictionaries can be tuples (but not lists).
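A minimal sketch of the dictionary case, using made-up grid coordinates as keys:

```python
# Hypothetical example data: grid coordinates mapped to a label.
grid = {}
grid[(0, 0)] = 'origin'
grid[(2, 3)] = 'treasure'
print(grid[(2, 3)])  # treasure

# A list cannot be a key because it is mutable and therefore not hashable.
failed = False
try:
    grid[[2, 3]] = 'oops'
except TypeError:
    failed = True
print(failed)  # True
```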
Convert list to tuple and tuple to list
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
print(planets)
print(planets.__class__.__name__)
tpl = tuple(planets)
print(tpl)
print(tpl.__class__.__name__)
lst = list(tpl)
print(lst)
print(lst.__class__.__name__)
Output:
['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
list
('Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn')
tuple
['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
list
Enumerate returns tuples
- enumerate
- tuple
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
enu = enumerate(planets)
print(type(enu).__name__)
print(enu)
print('-----')
element = next(enu)
print(type(element))
print(element)
print('-----')
for tpl in enumerate(planets):
    print(tpl[0], tpl[1])
# The tuples can also be unpacked directly in the for loop:
#for ix, planet in enumerate(planets):
#    print(ix, planet)
Output:
enumerate
<enumerate object at 0x7f7ede7e37c0>
-----
<class 'tuple'>
(0, 'Mercury')
-----
0 Mercury
1 Venus
2 Earth
3 Mars
4 Jupiter
5 Saturn
Change a tuple
z = ([1, 2], [3, 4])
print(z) # ([1, 2], [3, 4])
z[0].append(5)
print(z) # ([1, 2, 5], [3, 4])
# z[0] = [7, 8] # TypeError: 'tuple' object does not support item assignment
# z.append(7) # AttributeError: 'tuple' object has no attribute 'append'
Sort tuples
students = [
('John', 'A', 2),
('John', 'B', 2),
('John', 'A', 3),
('Anne', 'B', 1),
('Anne', 'A', 2),
('Anne', 'A', 1),
]
print(students)
print(sorted(students))
"""
[
('Anne', 'A', 1),
('Anne', 'A', 2),
('Anne', 'B', 1),
('John', 'A', 2),
('John', 'A', 3),
('John', 'B', 2)
]
"""
Sort tuples by specific elements
Sorting tuples or lists, or other complex structures
students = [
('John', 'A', 2),
('Zoro', 'C', 1),
('Dave', 'B', 3),
]
print(students)
# [('John', 'A', 2), ('Zoro', 'C', 1), ('Dave', 'B', 3)]
# by default the tuples are compared starting with their first element
print(sorted(students))
# [('Dave', 'B', 3), ('John', 'A', 2), ('Zoro', 'C', 1)]
# sort by the 2nd element of the tuples (index 1)
print(sorted(students, key=lambda s : s[1]))
# [('John', 'A', 2), ('Dave', 'B', 3), ('Zoro', 'C', 1)]
# sort by the 3rd element of the tuples (index 2)
print(sorted(students, key=lambda s : s[2]))
# [('Zoro', 'C', 1), ('John', 'A', 2), ('Dave', 'B', 3)]
# itemgetter(2) does the same as the lambda above;
# maybe it is simpler than the lambda version and it is probably faster
from operator import itemgetter
print(sorted(students, key=itemgetter(2)))
# [('Zoro', 'C', 1), ('John', 'A', 2), ('Dave', 'B', 3)]
Sort and secondary sort
We have a list of words. It is easy to sort them by length, but what will be the order among the words that have the same length?
A sort using a lambda-function that returns a tuple can provide the secondary sort order.
planets1 = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']
planets2 = ['Mercury', 'Earth', 'Venus', 'Mars', 'Jupiter', 'Saturn']
print(sorted(planets1, key=len))
# ['Mars', 'Venus', 'Earth', 'Saturn', 'Mercury', 'Jupiter']
print(sorted(planets2, key=len))
# ['Mars', 'Earth', 'Venus', 'Saturn', 'Mercury', 'Jupiter']
print(sorted(planets1, key=lambda w: (len(w), w)))
# ['Mars', 'Earth', 'Venus', 'Saturn', 'Jupiter', 'Mercury']
print(sorted(planets2, key=lambda w: (len(w), w)))
# ['Mars', 'Earth', 'Venus', 'Saturn', 'Jupiter', 'Mercury']
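The same tuple-as-key idea can mix directions. The following is a minimal sketch (the variable name by_length_desc is only illustrative) that puts the longest words first by negating the length, while words of the same length stay in alphabetical order.

```python
planets = ['Mercury', 'Venus', 'Earth', 'Mars', 'Jupiter', 'Saturn']

# Primary key: the length, descending (hence the minus sign).
# Secondary key: the word itself, ascending.
by_length_desc = sorted(planets, key=lambda w: (-len(w), w))
print(by_length_desc)
# ['Jupiter', 'Mercury', 'Saturn', 'Earth', 'Venus', 'Mars']
```

Negating works only for numeric keys; for strings one would need a different approach, such as sorting twice.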
Files
File types: Text vs Binary
You probably know many file types such as images (png, jpg, ...), Word, Excel, mp3, mp4, csv, and now also .py files. Internally they fall into two big categories: text files and binary files. Text files are the ones that look readable if you open them with a plain text editor such as Notepad. Binary files will look like a mess if you open them in Notepad.
For binary files you need a special application to "look" at their content: for example the Excel and Word programs for the appropriate files, an image viewer for images, VLC to watch an mp4, and some audio player to hear the content of mp3 files.
- Text: Can make sense when opened with Notepad: .txt, csv, .py, .pl, ..., HTML , XML, YAML, JSON
- Binary: Need specialized tool to make sense of it: Images, Zip files, Word, Excel, .exe, mp3, mp4
In Python there are specialized modules for each well-known binary type that handle files of that format. Text files, on the other hand, can be handled by the low-level file-reading functions; however, even for those we usually have modules that know how to read and interpret the specific formats (e.g. CSV, HTML, XML, YAML, JSON parsers).
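To illustrate the difference between low-level reading and a format-aware module, here is a small sketch using the standard json module. The file name planet.json and its content are made up for the example.

```python
import json
import os
import tempfile

# Create a small JSON file to work with (the name and content are arbitrary).
path = os.path.join(tempfile.mkdtemp(), "planet.json")
with open(path, "w") as fh:
    fh.write('{"name": "Earth", "moons": 1}')

# Low-level reading: we get the raw text and nothing more.
with open(path) as fh:
    raw = fh.read()
print(type(raw).__name__, raw)

# Format-aware reading: the json module turns the same text into a dictionary.
with open(path) as fh:
    data = json.load(fh)
print(type(data).__name__, data["moons"])
```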
Open vs. Read vs. Load
The expression "open a file" has two distinct meanings for programmers and users of software. For a user of Word, for example, "open the file" would mean to be able to see its content in a formatted way inside the editor.
When a programmer - now acting as a regular user - opens a Python file in an editor such as Notepad++ or Pycharm, the expectation is to see the content of that program with nice colors.
However in order to provide this the programmer behind these applications had to do several things.
- Connect to a file on the disk (aka. "opening the file" in programmer speak).
- Read the content of the file from the disk to memory.
- Format the content read from the file as expected by the user of that application.
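The three steps above can be sketched in a few lines. This is only an illustration: the file hello.py is created on the fly, and "formatting" here simply means adding line numbers, where a real editor would do syntax highlighting.

```python
import os
import tempfile

# Prepare a tiny file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "hello.py")
with open(path, "w") as out:
    out.write("print('hello')\nprint('world')\n")

# 1. Connect to the file on the disk ("open" in programmer speak).
fh = open(path)
# 2. Read the content from the disk into memory.
content = fh.read()
fh.close()
# 3. Format the content for the user; here we only add line numbers.
formatted = []
for number, line in enumerate(content.splitlines(), start=1):
    formatted.append(f"{number:>3}: {line}")
for row in formatted:
    print(row)
```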
Binary files: Images
This is just a quick example how to use the Pillow module to handle images. There is a whole chapter on dealing with images.
pip install pillow
from PIL import Image
import sys
if len(sys.argv) != 3 and len(sys.argv) != 4:
    exit(f"Usage: {sys.argv[0]} FILENAME %CHANGE [OUTFILE]")
in_file = sys.argv[1]
change = float(sys.argv[2])
out_file = sys.argv[3] if len(sys.argv) == 4 else None
img = Image.open(in_file) # opening file and reading meta
print(img.size) # a tuple
print(img.size[0]) # width
print(img.size[1]) # height
width = int(change * img.size[0] / 100)
height = int(change * img.size[1] / 100)
out = img.resize((width, height))
#print("image size: ", sys.getsizeof(list(img.im)))
print("image size: ", sys.getsizeof(img.getdata()))
print("image size: ", sys.getsizeof(img.im))
out.show()
print("image size: ", sys.getsizeof(out.im))
if out_file:
    out.save(out_file)
python examples/files/get_image_size.py examples/pil/first.png
Output:
(800, 450)
800
450
48
1080033
Reading an Excel file
There are many ways to deal with Excel files as well.
pip install openpyxl
import openpyxl
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} FILENAME")
in_file = sys.argv[1]
wb = openpyxl.load_workbook(filename=in_file)
for ws in wb.worksheets:
    print(ws.title)
ws = wb.worksheets[0]
print(ws['A1'].value)
Reading a YAML file
YAML files are often used as configuration files.
# A comment
Course:
  Language:
    Name: Ladino
    IETF BCP 47: lad
  For speakers of:
    Name: English
    IETF BCP 47: en
  Special characters: []
Modules:
  - basic/
  - words/
  - verbs/
  - grammar/
  - names/
  - sentences/
pip install pyyaml
import yaml

filename = "data.yaml"
with open(filename) as fh:
    # safe_load builds only plain data (dict, list, str, ...), which is all we need here
    data = yaml.safe_load(fh)
print(data)
Read and analyze a text file
import sys

if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} FILENAME")
filename = sys.argv[1]

total = 0
with open(filename, "r") as fh:
    for row in fh:
        if "Report" not in row:
            continue
        text, value = row.split(":")
        # print(value)
        value = float(value.strip())
        # print(value)
        total += value
print(total)
Open and read file (easy but not recommended)
In some code you will encounter the following way of opening files. This was used before "with" was added to the language. It is not the recommended way of opening a file, as you might easily forget to call "close" and that might cause trouble. For example, you might lose data. Don't do that.
I am showing this as the first example because it is easier to understand.
filename = 'examples/files/numbers.txt'
fh = open(filename, 'r')
for line in fh:
    print(line)
fh.close()
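If you ever have to keep this older style, one way to make sure close is called even when something goes wrong in the middle is a try/finally block. The following is a sketch; the sample file is created in a temporary folder only to make the example self-contained.

```python
import os
import tempfile

# Prepare a sample file similar to numbers.txt.
path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
out = open(path, "w")
out.write("23 345 12345\n67 189 23 17\n")
out.close()

fh = open(path, 'r')
try:
    for line in fh:
        print(line, end="")
finally:
    # This runs even if the loop above raises an exception.
    fh.close()
print(fh.closed)
```

The with statement shown in the next section does the same with less typing.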
Open and read file using with (recommended)
- open
- close
- with
filename = 'examples/files/numbers.txt'
with open(filename, 'r') as fh:  # open(filename) would be enough
    for line in fh:
        print(line)  # duplicate newlines
# close is called when we leave the 'with' context
Read file remove newlines
- trim
- rstrip
- chomp
filename = 'examples/files/numbers.txt'
with open(filename, 'r') as fh:
    for line in fh:
        line = line.rstrip("\n")
        print(line)
Filename on the command line
import sys

def main():
    if len(sys.argv) != 2:
        exit(f"Usage: {sys.argv[0]} FILENAME")
    filename = sys.argv[1]
    with open(filename) as fh:
        print("Working on the file", filename)

main()
$ python single.py
Usage: single.py FILENAME
$ python single.py numbers.txt
Working on the file numbers.txt
Filehandle with return
import sys

def process_file(filename):
    with open(filename, 'r') as fh:
        for line in fh:
            line = line.rstrip("\n")
            if len(line) > 0 and line[0] == '#':
                return
            if len(line) > 1 and line[0:2] == '//':
                return
            # process the line
            print(line)

# sys.argv[0] is the name of this script, so the script reads its own source code
process_file(sys.argv[0])
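Returning from inside a with block still closes the file. The sketch below checks this by handing the filehandle back to the caller; the function first_line and the sample file are hypothetical and exist only for this demonstration.

```python
import os
import tempfile

def first_line(filename):
    with open(filename) as fh:
        for line in fh:
            # Leaving the function also leaves the 'with' block.
            return fh, line.rstrip("\n")
    return fh, None   # the file was empty

path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as out:
    out.write("first\nsecond\n")

handle, line = first_line(path)
print(line)           # first
print(handle.closed)  # True
```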
Read all the lines into a list
- readlines
There are rare cases when you need the whole content of a file in memory and you cannot process it line by line.
In those rare cases we have several options. readlines will read the whole content into a list, converting each line of the file into an element of the list.
Beware though, if the file is too big, it might not fit in the free memory of the computer.
filename = 'examples/files/numbers.txt'
with open(filename, 'r') as fh:
    lines = fh.readlines()  # reads all the lines into a list
print(f"number of lines: {len(lines)}")
for line in lines:
    print(line, end="")
print('------')
lines.reverse()
for line in lines:
    print(line, end="")
Output:
number of lines: 2
23 345 12345
67 189 23 17
------
67 189 23 17
23 345 12345
Read all the characters into a string (slurp)
- read
In some other cases, especially if you are looking for a pattern that starts on one line but ends on another, you'd be better off having the whole file as a single string in a variable. This is where the read method comes in handy.
It can also be used to read the file in chunks.
filename = 'examples/files/numbers.txt'
with open(filename, 'r') as fh:
    content = fh.read()  # reads all the lines into a string
print(type(content))
print(len(content))  # number of characters in file
print(content)  # the content of the file
Output:
<class 'str'>
26
23 345 12345
67 189 23 17
read(20) will read at most 20 characters if the file was opened in text mode (and at most 20 bytes in binary mode).
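Reading in chunks could look like the following sketch. The chunk size of 10 is arbitrary, and the sample file is created in a temporary folder with the same content as numbers.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "numbers.txt")
with open(path, "w") as out:
    out.write("23 345 12345\n67 189 23 17\n")

chunks = []
with open(path, "r") as fh:
    while True:
        chunk = fh.read(10)   # at most 10 characters per call
        if chunk == "":        # an empty string means end of file
            break
        chunks.append(chunk)

sizes = [len(chunk) for chunk in chunks]
print(sizes)   # [10, 10, 6]
```

Joining the chunks gives back the full content, so nothing is lost at the chunk borders.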
Not existing file
- IOError
filename = 'examples/files/unicorns.txt'
with open(filename, 'r') as fh:
    lines = fh.read()
print("still running")

# In Python 2:
# Traceback (most recent call last):
#   File "examples/files/open_file.py", line 5, in <module>
#     with open(filename, 'r') as fh:
# IOError: [Errno 2] No such file or directory: 'examples/files/unicorns.txt'

# In Python 3:
# Traceback (most recent call last):
#   File "examples/files/open_file.py", line 3, in <module>
#     with open(filename, 'r') as fh:
# FileNotFoundError: [Errno 2] No such file or directory: 'examples/files/unicorns.txt'
Open file exception handling
- try
- except
Exception handling
filename = 'examples/files/unicorns.txt'
try:
    with open(filename, 'r') as fh:
        lines = fh.read()
except Exception as err:
    print('There was some error in the file operations.')
    print(err)
    print(type(err).__name__)
print('Still running.')
Output:
There was some error in the file operations.
[Errno 2] No such file or directory: 'examples/files/unicorns.txt'
FileNotFoundError
Still running.
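Catching Exception handles every kind of error in the same way. When different problems need different reactions, one option is to catch the more specific exception classes. In this sketch the variable result exists only so that the outcome can be inspected, and the file is assumed not to exist.

```python
import os
import tempfile

# A path inside a fresh temporary folder, so the file surely does not exist.
filename = os.path.join(tempfile.mkdtemp(), 'unicorns.txt')

try:
    with open(filename, 'r') as fh:
        content = fh.read()
except FileNotFoundError:
    result = "missing"
    print(f"The file '{filename}' does not exist.")
except PermissionError:
    result = "forbidden"
    print(f"No permission to read '{filename}'.")
else:
    # Runs only if no exception was raised.
    result = "ok"
    print(f"Read {len(content)} characters.")
print('Still running.')
```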
Open many files - exception handling
import sys

def main():
    for filename in sys.argv[1:]:
        try:
            do_some_stuff(filename)
        except Exception as err:
            print(f"trouble with '{filename}': Error: {err}")

def do_some_stuff(filename):
    with open(filename) as fh:
        total = 0
        count = 0
        for line in fh:
            number = float(line)
            total += number
            count += 1
        print("Average: ", total/count)

main()
23
1
192
17
1
2
3
4
5
6
$ python average_from_files.py number_per_line.txt empty.txt number_per_line2.txt
Average: 58.25
trouble with 'empty.txt': Error: division by zero
Average: 3.5
$ python average_from_files.py numbers.txt
trouble with 'numbers.txt': Error: could not convert string to float: '23 345 12345\n'
$ python average_from_files.py more_numbers.txt
trouble with 'more_numbers.txt': Error: [Errno 2] No such file or directory: 'more_numbers.txt'
Writing to file
- open
- write
In order to write to a file we open it passing the "w" write mode. If the file did not exist, Python will try to create it.
If the file already existed, all its content will be removed, so after such a call to open we'll end up with an empty file if we don't write into it.
Once the file is opened we can use the write method to write to it. This will NOT automatically append a newline at the end, so we'll have to include \n if we would like to insert a newline.
Opening the file will fail if we don't have write permissions or if the folder in which we are trying to create the file does not exist.
filename = 'data.txt'
with open(filename, 'w') as out:
    out.write('text\n')
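The two claims above (the old content is removed, and a missing folder makes open fail) can be checked with a short sketch. It works in a temporary folder, and the sub-folder name no_such_dir is assumed not to exist.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'data.txt')

with open(filename, 'w') as out:
    out.write('first version\n')
with open(filename, 'w') as out:   # opening with 'w' again empties the file
    out.write('second\n')
with open(filename) as fh:
    content = fh.read()
print(repr(content))   # only the text of the second write remains

error_name = None
try:
    with open(os.path.join(folder, 'no_such_dir', 'data.txt'), 'w') as out:
        out.write('text\n')
except FileNotFoundError as err:
    error_name = type(err).__name__
print(error_name)
```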
Print to file
- open
We can also use the print function to print (or write) to a file. In this case the same rules apply as when printing to standard output (a trailing newline is added automatically and a space is inserted between the parameters). We do this by passing the file-handle as the value of the file parameter of print.
filename = 'out.txt'
with open(filename, 'w') as fh:
    print("Hello", "World", file=fh)
Append to file
- append
filename = 'data.txt'
with open(filename, 'a') as out:
    out.write('append more text\n')
Binary mode
- rb
import sys

if len(sys.argv) != 2:
    exit("Need name of file")
filename = sys.argv[1]
try:
    with open(filename, 'rb') as fh:
        while True:
            binary_str = fh.read(1000)
            print(len(binary_str))
            if len(binary_str) == 0:
                break
            # do something with the content of the binary_str
except Exception:
    pass
python examples/files/read_binary.py examples/pil/first.png
1000
1000
1000
1000
1000
775
0
Does file exist? Is it a file?
- os.path.exists
- os.path.isfile
- os.path.isdir
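A minimal sketch of the three functions listed above, run against a folder, a file, and a path that does not exist. All three paths are created (or left missing) in a temporary folder just for this example.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
filename = os.path.join(folder, 'data.txt')
with open(filename, 'w') as out:
    out.write('text\n')
missing = os.path.join(folder, 'unicorns.txt')   # never created

results = []
for path in [folder, filename, missing]:
    row = (os.path.exists(path), os.path.isfile(path), os.path.isdir(path))
    results.append(row)
    print(path, row)

# folder:  (True, False, True)
# file:    (True, True, False)
# missing: (False, False, False)
```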
Direct access of a line in a file
names = ['Foo', 'Bar', 'Baz']
for name in names:
    print(name)
print(names[1])
Output:
Foo
Bar
Baz
Bar
import sys

if len(sys.argv) != 2:
    exit(f"Run {sys.argv[0]} FILENAME")
filename = sys.argv[1]

# We can iterate over the lines
#with open(filename, 'r') as fh:
#    for line in fh:
#        print(line)

# We cannot access an element
with open(filename, 'r') as fh:
    print(fh[2])
Traceback (most recent call last):
  File "examples/files/fh_access.py", line 14, in <module>
    print(fh[2])
TypeError: '_io.TextIOWrapper' object is not subscriptable
This does NOT work because a filehandle can only be read sequentially; it cannot be indexed like a list.
import sys

if len(sys.argv) != 2:
    exit(f"Run {sys.argv[0]} FILENAME")
filename = sys.argv[1]
with open(filename, 'r') as fh:
    rows = fh.readlines()
print(rows[2])
import sys

if len(sys.argv) != 2:
    exit(f"Run {sys.argv[0]} FILENAME")
filename = sys.argv[1]
with open(filename, 'r') as fh:
    count = 0
    for row in fh:
        if count == 2:
            break
        count += 1
print(row)
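Another way to fetch a single line without loading the whole file is itertools.islice from the standard library. This is a sketch, not the approach used in the examples above; the sample file and the variable index are made up.

```python
import itertools
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "names.txt")
with open(path, "w") as out:
    out.write("Foo\nBar\nBaz\nQux\n")

index = 2   # the third line, counting from 0
with open(path) as fh:
    # islice skips the first `index` lines and yields the next one.
    # The second argument of next() is returned if the file is too short.
    row = next(itertools.islice(fh, index, index + 1), None)
print(repr(row))   # 'Baz\n'
```

Unlike the counting loop, this version yields None instead of the last line when the file has fewer lines than requested.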
Exercise: count digits
23 345 12345
67 189 23 17
- Given the file examples/files/numbers.txt (or a similar file), create a file called count_digits_in_file.py that will count how many times each digit appears. The output will look like this, just with different values.
- Save the results in a file called report.txt.
0 0
1 3
2 3
3 4
4 2
5 2
6 1
7 2
8 1
9 1
Exercise: remove newlines
- Create a file called remove_newlines.py that will be able to read all the lines of a given file into a list and remove trailing newlines.
Exercise: print lines with Report
In many cases you get some text report in some free form of text (and not in a CSV file or an Excel file.) You need to extract the information from such a file after recognizing the patterns. This exercise tries to provide such a case.
- Create a script called text_report.py
Given a file that looks like this:
This is a text report there are some lines that start with
Report: 23
Other linese has this somewhere in the middle.
Begin report
Report: -3
Like this. Report: 17
More lines starting with
Report: 44
End report
We will have some exercise with this file. Maybe 4 exercises.
Report: 123
- Print out the first line that starts with Report:.
- Print out all the lines that have the string Report: in them.
- Print out all the lines that start with the string Report:.
- Print out the numbers that are after Report:. (e.g. for Report: 42 print out 42)
- Add up the numbers that are after the string Report:. So in the above example the result is expected to be 204.
- Do the same, but take into account only the lines between the Begin report and End report lines. (sum expected to be 58)
Exercise: color selector
- Create a file similar to the colors.txt file and use it as the list of colors in the earlier example where we prompted for a color.
- Call the new script color_selector_file.py
blue
yellow
white
green
Extend the previous example by letting the user provide the name of the file on the command line:
python color.py examples/files/color.txt
Exercise: ROT13
- rot13
- Implement ROT13:
- Create a script called rot13_file.py that, given a file on the command line, will replace its content with the rot13 of it.
Exercise: Combine lists
Tomato=78
Avocado=23
Pumpkin=100
Cucumber=17
Avocado=10
Cucumber=10
Write a script called combine_lists.py that takes the two files and combines them adding the values for each vegetable. The expected result is:
Avocado=33
Cucumber=27
Pumpkin=100
Tomato=78
Exercise: Number guessing game - save to file
Level 7
- Create a file called number_guessing_game_7.py
- Based on the previous solutions.
- When starting a new game ask the user for their name and save the game information in the file.
- The hidden number and the guesses.
- Have an option to show the previously played games.
Solution: count numbers
import sys

if len(sys.argv) < 2:
    exit("Need name of file.")
counter = [0] * 10
filename = sys.argv[1]
with open(filename) as fh:
    for line in fh:
        for c in line.rstrip("\n"):
            if c == ' ':
                continue
            c = int(c)
            counter[c] += 1
for i in range(10):
    print("{} {}".format(i, counter[i]))
Solution: remove newlines
import sys

# sys.argv[0] is this script itself; use sys.argv[1] to read a file given on the command line
filename = sys.argv[0]
with open(filename) as fh:
    lines = []
    for line in fh:
        lines.append(line.rstrip("\n"))
print(lines)
Solution: print lines with Report
import sys

def main():
    if len(sys.argv) != 2:
        exit(f"Usage: {sys.argv[0]} FILENAME")
    # text_report.txt
    in_file = sys.argv[1]
    show_rows_with_report(in_file)
    show_rows_start_with_report(in_file)
    show_numbers_after_report(in_file)
    sum_numbers_after_report(in_file)
    sum_numbers_after_report_within_begin_end_section(in_file)

def show_rows_with_report(in_file):
    with open(in_file) as fh:
        for row in fh:
            row = row.rstrip("\n")
            if 'Report:' in row:
                print(row)
    print('-' * 20)

def show_rows_start_with_report(in_file):
    with open(in_file) as fh:
        for row in fh:
            row = row.rstrip("\n")
            if row.startswith('Report:'):
                print(row)
    print('-' * 20)

def show_numbers_after_report(in_file):
    with open(in_file) as fh:
        for row in fh:
            row = row.rstrip("\n")
            if 'Report:' in row:
                parts = row.split(':')
                print(int(parts[1]))
    print('-' * 20)

def sum_numbers_after_report(in_file):
    total = 0
    with open(in_file) as fh:
        for row in fh:
            row = row.rstrip("\n")
            if 'Report:' in row:
                parts = row.split(':')
                total += int(parts[1])
    print(f"Total: {total}")
    print('-' * 20)

def sum_numbers_after_report_within_begin_end_section(in_file):
    in_section = False
    total = 0
    with open(in_file) as fh:
        for row in fh:
            row = row.rstrip("\n")
            if row == 'Begin report':
                in_section = True
                continue
            if row == 'End report':
                in_section = False
                continue
            if in_section:
                if 'Report:' in row:
                    parts = row.split(':')
                    total += int(parts[1])
                    print(int(parts[1]))
    print(f"Total in section: {total}")
    print('-' * 20)

main()
Solution: color selector
def main():
    try:
        with open('colors.txt') as fh:
            colors = []
            for line in fh:
                colors.append(line.rstrip("\n"))
    except IOError:
        print("Could not open colors.txt")
        exit()
    for i in range(len(colors)):
        print("{}) {}".format(i, colors[i]))
    c = int(input("Select color: "))
    print(colors[c])

main()
Solution: ROT13
- rot13
import sys
import codecs
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} FILENAME")
filename = sys.argv[1]
with open(filename, 'r') as fh:
    original = fh.read()
encoded = codecs.encode(original, encoding='rot_13')
#print(encoded)
with open(filename, 'w') as fh:
    fh.write(encoded)
Solution: Combine lists
files = ['examples/files/a.txt', 'examples/files/b.txt']
names = []
values = []
for filename in files:
    with open(filename) as fh:
        for line in fh:
            name, value = line.rstrip("\n").split("=")
            value = int(value)
            if name in names:
                idx = names.index(name)
                values[idx] += value
            else:
                names.append(name)
                values.append(value)
with open('out.txt', 'w') as fh:
    for ix in range(len(names)):
        fh.write("{}={}\n".format(names[ix], values[ix]))
Solution: Combine lists with tuple
- zip
- tuple
files = ['examples/files/a.txt', 'examples/files/b.txt']
names = []
values = []
for filename in files:
    with open(filename) as fh:
        for line in fh:
            name, value = line.rstrip("\n").split("=")
            value = int(value)
            if name in names:
                idx = names.index(name)
                values[idx] += value
            else:
                names.append(name)
                values.append(value)
pairs = []
for ix in range(len(names)):
    pairs.append((names[ix], values[ix]))
# for name, value in zip(names, values):
#     pairs.append((name, value))
print(pairs)
print(sorted(pairs))
print(sorted(pairs, key=lambda p: p[1]))
with open('out.txt', 'w') as fh:
    for name, value in pairs:
        fh.write("{}={}\n".format(name, value))
Filehandle using with and not using it
- open
- close
- with
filename = 'examples/files/numbers.txt'

fh = open(filename, 'r')
print(fh.closed)  # False
data = fh.read()
# do something with the data
fh.close()
print(fh.closed)  # True

with open(filename, 'r') as fh:
    print(fh.closed)  # False
    data = fh.read()
print(fh.closed)  # True - the file was closed when we left the 'with' block
Dictionary (hash)
What is a dictionary
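- Key-value pairs. Since Python 3.7 the insertion order is kept; in older versions the order was arbitrary.
- Keys must be immutable (hashable) objects such as numbers, strings, and tuples.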
- Values can be any object.
When to use dictionaries
- ID to Name mapping.
- Object to Count mapping.
- Name of a feature to value of the feature.
- Name of an attribute to value of the attribute.
Various dictionary examples
person_1 = {
'fname': 'Moshe',
'lname': 'Cohen',
'email': 'moshe@cohen.com',
'children': ['Maya', 'Tal'],
}
person_2 = {
'fname': 'Dana',
'lname': 'Levy',
'email': 'dana@levy.com',
'phone': '123-456',
}
from person import person_1, person_2
people = [person_1, person_2]
print(people[0]['fname'])
for person in people:
    print(person)
print('----------------')
people_by_name = {
'Moshe Cohen': 'moshe@cohen.com',
'Dana Levy': 'dana@levy.com',
}
print(people_by_name['Dana Levy'])
for name, email in people_by_name.items():
    print(f"{name} -> {email}")
print('----------------')
full_people_by_name = {
'Moshe': person_1,
'Dana': person_2,
}
print(full_people_by_name['Moshe']['lname'])
print(full_people_by_name['Dana'])
for fname, data in full_people_by_name.items():
    print(fname)
    print(data)
Moshe
{'fname': 'Moshe', 'lname': 'Cohen', 'email': 'moshe@cohen.com', 'children': ['Maya', 'Tal']}
{'fname': 'Dana', 'lname': 'Levy', 'email': 'dana@levy.com', 'phone': '123-456'}
----------------
dana@levy.com
Moshe Cohen -> moshe@cohen.com
Dana Levy -> dana@levy.com
----------------
Cohen
{'fname': 'Dana', 'lname': 'Levy', 'email': 'dana@levy.com', 'phone': '123-456'}
Moshe
{'fname': 'Moshe', 'lname': 'Cohen', 'email': 'moshe@cohen.com', 'children': ['Maya', 'Tal']}
Dana
{'fname': 'Dana', 'lname': 'Levy', 'email': 'dana@levy.com', 'phone': '123-456'}
Dictionary
- dictionary
- dict
- {}
- We can start from an empty dictionary and then fill it with key-value pairs.
user = {}
user['name'] = 'Foobar'
print(user) # {'name': 'Foobar'}
user['email'] = 'foo@bar.com'
print(user) # {'name': 'Foobar', 'email': 'foo@bar.com'}
the_name = user['name']
print(the_name) # Foobar
field = 'name'
the_value = user[field]
print(the_value) # Foobar
user['name'] = 'Edith Piaf'
print(user) # {'name': 'Edith Piaf', 'email': 'foo@bar.com'}
Create dictionary
- We can also start with a dictionary that already has some data in it.
user = {
'fname': 'Foo',
'lname': 'Bar',
}
print(user) # {'fname': 'Foo', 'lname': 'Bar'}
user['email'] = 'foo@bar.com'
keys
- keys
- Sometimes we don't know up front what keys we might have.
Jupiter:300
Saturn:500
Earth:0
import sys

if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} FILENAME")
filename = sys.argv[1]
planets = {}
with open(filename) as fh:
    for row in fh:
        row = row.rstrip("\n")
        # print(row)
        # planet, distance = row.split(":")
        tpl = row.split(":")
        if len(tpl) != 2:
            print(f"not good {row}")
            #exit(1)
            continue
        planet, distance = tpl
        # print(planet)
        planets[planet] = distance
print(planets)               # {'Jupiter': '300', 'Saturn': '500', 'Earth': '0'}
print(planets.keys())        # dict_keys(['Jupiter', 'Saturn', 'Earth'])
print(list(planets.keys()))  # ['Jupiter', 'Saturn', 'Earth']
- Since Python 3.7 the keys are returned in insertion order. In older versions they came back in seemingly random order.
Loop over keys
- keys
user = {
'fname': 'Foo',
'lname': 'Bar',
}
for key in user.keys():
    print(key)
# fname
# lname
for key in user.keys():
    print(f"{key} -> {user[key]}")
# fname -> Foo
# lname -> Bar
Loop over dictionary keys
Looping over the "dictionary" is just like looping over the keys, but personally I prefer when we use the somedictionary.keys() expression.
user = {
'fname': 'Foo',
'lname': 'Bar',
}
for key in user:
    print(f"{key} -> {user[key]}")
# fname -> Foo
# lname -> Bar
Loop using items
- items
people = {
"Tal" : "123",
"Maya" : "456",
"Ruth" : "789",
}
for name, uid in people.items():
    print(f"{name} => {uid}")
Tal => 123
Maya => 456
Ruth => 789
user = {
'fname': 'Foo',
'lname': 'Bar',
}
for tpl in user.items():  # iterates on tuples
    print(f"{tpl[0]} -> {tpl[1]}")
    print("{} -> {}".format(*tpl))
# fname -> Foo
# fname -> Foo
# lname -> Bar
# lname -> Bar
values
- values
- Values are returned in the same order as the keys.
user = {
'fname': 'Foo',
'lname': 'Bar',
'workplace': 'Bar',
}
print(user) # {'fname': 'Foo', 'lname': 'Bar', 'workplace': 'Bar'}
print(user.keys()) # dict_keys(['fname', 'lname', 'workplace'])
print(user.values()) # dict_values(['Foo', 'Bar', 'Bar'])
Not existing key
If we try to fetch the value of a key that does not exist, we get an exception.
def main():
    user = {
        'fname': 'Foo',
        'lname': 'Bar',
    }
    print(user['fname'])
    print(user['email'])

main()
Foo
Traceback (most recent call last):
File "examples/dictionary/no_such_key.py", line 11, in <module>
main()
File "examples/dictionary/no_such_key.py", line 9, in main
print(user['email'])
KeyError: 'email'
Get key
- get
If we use the get method, we get None if the key does not exist.
user = {
'fname': 'Foo',
'lname': 'Bar',
'address': None,
}
print(user.get('fname')) # Foo - because 'fname' has the value 'Foo'
print(user.get('email')) # None - because 'email' does not exist
print(user.get('address')) # None - because 'address' has the value None
# set a default value to return
print(user.get('fname', 'ABC')) # Foo - because the value of 'fname' is 'Foo'
print(user.get('answer', 42)) # 42 - because 'answer' does not exist
print(user.get('address', 23)) # None - because None is the value of the 'address' key
Foo
None
None
Foo
42
None
None will be interpreted as False if checked as a boolean.
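Because get returns None both for a missing key and for a key whose value is None, it cannot tell the two cases apart on its own. The sketch below shows two possible ways around this; the sentinel name MISSING is hypothetical.

```python
user = {
    'fname': 'Foo',
    'address': None,
}

# get() alone cannot tell "missing" from "present, but None".
print(user.get('address') is None)   # True
print(user.get('email') is None)     # True

# The 'in' operator looks at the keys only.
print('address' in user)             # True
print('email' in user)               # False

# Alternatively, pass a default that can never be a real value.
MISSING = object()
status = []
for key in ['fname', 'address', 'email']:
    value = user.get(key, MISSING)
    status.append('missing' if value is MISSING else 'present')
print(status)   # ['present', 'present', 'missing']
```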
Does the key exist?
- exists
- in
user = {
'fname': 'Foo',
'lname': 'Bar',
'answer': None,
}
print('fname' in user) # True
print('email' in user) # False
print('answer' in user) # True
print('Foo' in user) # False
for attr in ['fname', 'email', 'lname']:
    if attr in user:
        print(f"{attr} => {user[attr]}")
# fname => Foo
# lname => Bar
True
False
True
False
fname => Foo
lname => Bar
Does the value exist?
- values
user = {
'fname': 'Foo',
'lname': 'Bar',
}
print('fname' in user.values()) # False
print('Foo' in user.values()) # True
False
True
Delete key
- del
- pop
user = {
'fname': 'Foo',
'lname': 'Bar',
'email': 'foo@bar.com',
}
print(user) # {'fname': 'Foo', 'lname': 'Bar', 'email': 'foo@bar.com'}
fname = user['fname']
del user['fname']
print(fname) # Foo
print(user) # {'lname': 'Bar', 'email': 'foo@bar.com'}
lname_was = user.pop('lname')
print(lname_was) # Bar
print(user) # {'email': 'foo@bar.com'}
{'fname': 'Foo', 'lname': 'Bar', 'email': 'foo@bar.com'}
Foo
{'lname': 'Bar', 'email': 'foo@bar.com'}
Bar
{'email': 'foo@bar.com'}
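Both del and pop raise a KeyError if the key does not exist. If a missing key is acceptable, pop can be given a second argument that is returned instead. A small sketch (the key phone is assumed to be absent):

```python
user = {
    'fname': 'Foo',
    'email': 'foo@bar.com',
}

# del user['phone'] would raise KeyError: 'phone'
# pop with a second argument returns that value instead of raising.
phone = user.pop('phone', None)
print(phone)   # None

email = user.pop('email', None)
print(email)   # foo@bar.com
print(user)    # {'fname': 'Foo'}
```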
List of dictionaries
people = [
{
'name' : 'Foo Bar',
'email' : 'foo@example.com'
},
{
'name' : 'Tal Bar',
'email' : 'tal@example.com',
'address' : 'Borg, Country',
'children' : [
'Alpha',
'Beta'
]
}
]
children = people[1]['children']
# print(people)
print(people[0]['name'])
print(people[1]['children'][0])
people[1]['children'].append('Gamma')
print(children)
print(list(map(lambda p: p['name'], people)))
people[0]['children'] = ['Zorg', 'Buzz']
Foo Bar
Alpha
['Alpha', 'Beta', 'Gamma']
['Foo Bar', 'Tal Bar']
Shared dictionary
people = [
{
"name" : "Foo",
"id" : "1",
},
{
"name" : "Bar",
"id" : "2",
},
{
"name" : "Moo",
"id" : "3",
},
]
by_name = {}
by_id = {}
for person in people:
    by_name[person['name']] = person
    by_id[person['id']] = person
print(by_name)
print(by_id)
print('-------------------')
print(by_name["Foo"])
by_name["Foo"]['email'] = 'foo@weizmann.ac.il'
people[0]["name"] = "Foooooo"
print(by_name)
print(by_id)
print(by_name["Foo"]) # the key remained Foo !!!!
print(by_id["1"])
{'Foo': {'name': 'Foo', 'id': '1'}, 'Bar': {'name': 'Bar', 'id': '2'}, 'Moo': {'name': 'Moo', 'id': '3'}}
{'1': {'name': 'Foo', 'id': '1'}, '2': {'name': 'Bar', 'id': '2'}, '3': {'name': 'Moo', 'id': '3'}}
-------------------
{'name': 'Foo', 'id': '1'}
{'Foo': {'name': 'Foooooo', 'id': '1', 'email': 'foo@weizmann.ac.il'}, 'Bar': {'name': 'Bar', 'id': '2'}, 'Moo': {'name': 'Moo', 'id': '3'}}
{'1': {'name': 'Foooooo', 'id': '1', 'email': 'foo@weizmann.ac.il'}, '2': {'name': 'Bar', 'id': '2'}, '3': {'name': 'Moo', 'id': '3'}}
{'name': 'Foooooo', 'id': '1', 'email': 'foo@weizmann.ac.il'}
{'name': 'Foooooo', 'id': '1', 'email': 'foo@weizmann.ac.il'}
immutable collection: tuple as dictionary key
points = {}
p1 = (2, 3)
points[p1] = 'Joe'
points[(17, 5)] = 'Jane'
print(points)
for k in points.keys():
    print(k)
    print(k.__class__.__name__)
    print(points[k])
{(2, 3): 'Joe', (17, 5): 'Jane'}
(2, 3)
tuple
Joe
(17, 5)
tuple
Jane
immutable numbers: numbers as dictionary key
number = {
23 : "Twenty three",
17 : "Seventeen",
3.14 : "Three dot fourteen",
42 : "The answer",
}
print(number)
print(number[42])
print(number[3.14])
{23: 'Twenty three', 17: 'Seventeen', 3.14: 'Three dot fourteen', 42: 'The answer'}
The answer
Three dot fourteen
Sort a dictionary
When people say "sort a dictionary" they usually mean sorting the keys of the dictionary, but what does it mean in Python if we call sorted on a dictionary?
scores = {
'Foo' : 10,
'Bar' : 34,
'Miu' : 88,
'Abc' : 34,
}
print(scores) # {'Foo': 10, 'Bar': 34, 'Miu': 88, 'Abc': 34}
sorted_names = sorted(scores) # "sort dictionary" sorts the keys
print(sorted_names) # ['Abc', 'Bar', 'Foo', 'Miu']
sorted_keys = sorted(scores.keys())
print(sorted_keys) # ['Abc', 'Bar', 'Foo', 'Miu']
Sort dictionary values
scores = {
'Foo' : 10,
'Bar' : 34,
'Miu' : 88,
'Abc' : 34,
}
# sort the values, but we cannot get the keys back!
sorted_values = sorted(scores.values())
print(sorted_values) # [10, 34, 34, 88]
Sort dictionary by value
- Sort the keys by the values
scores = {
'Foo' : 10,
'Bar' : 34,
'Miu' : 88,
'Abc' : 34,
}
def by_value(x):
    return scores[x]
sorted_names = sorted(scores.keys(), key=by_value)
print(sorted_names) # ["Foo", "Bar", "Abc", "Miu"]
# sort using a lambda expression
sorted_names = sorted(scores.keys(), key=lambda x: scores[x])
print(sorted_names) # ["Foo", "Bar", "Abc", "Miu"]
for k in sorted_names:
    print("{} : {}".format(k, scores[k]))
# Foo : 10
# Bar : 34
# Abc : 34
# Miu : 88
scores = {
'Foo' : 10,
'Bar' : 34,
'Miu' : 88,
'Abc' : 34,
}
# sort the keys according to the values:
sorted_names = sorted(scores, key=scores.__getitem__)
print(sorted_names) # ["Foo", "Bar", "Abc", "Miu"]
for k in sorted_names:
    print("{} : {}".format(k, scores[k]))
# Foo : 10
# Bar : 34
# Abc : 34
# Miu : 88
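The examples above sort only the keys. If both the name and the score are needed, one option is to sort the (key, value) pairs returned by items. The sketch below also reverses the direction and uses the name as a tie-breaker; the variable name ranking is only illustrative.

```python
scores = {
    'Foo' : 10,
    'Bar' : 34,
    'Miu' : 88,
    'Abc' : 34,
}

# Highest score first (hence the minus sign); equal scores are ordered by name.
ranking = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
for name, score in ranking:
    print(f"{name} : {score}")
# Miu : 88
# Abc : 34
# Bar : 34
# Foo : 10
```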
Sort dictionary keys by value (another example)
- sort
- key
scores = {
"Jane" : 30,
"Joe" : 20,
"George" : 30,
"Hellena" : 90,
}
for name in scores.keys():
    print(f"{name:8} {scores[name]}")
print('')
for name in sorted(scores.keys()):
    print(f"{name:8} {scores[name]}")
print('')
for val in sorted(scores.values()):
    print(f"{val:8}")
print('')
for name in sorted(scores.keys(), key=lambda x: scores[x]):
    print(f"{name:8} {scores[name]}")
Jane 30
Joe 20
George 30
Hellena 90
George 30
Hellena 90
Jane 30
Joe 20
20
30
30
90
Joe 20
Jane 30
George 30
Hellena 90
Insertion Order is kept
Since Python 3.7
d = {}
d['a'] = 1
d['b'] = 2
d['c'] = 3
d['d'] = 4
print(d)
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
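Two details of the insertion order are worth checking: updating the value of an existing key does not move it, while deleting a key and adding it again puts it at the end. A short sketch (the variable names after_update and after_readd are only for the demonstration):

```python
d = {}
d['a'] = 1
d['b'] = 2
d['c'] = 3

d['a'] = 100                # updating a value keeps the key in its place
after_update = list(d)
print(after_update)         # ['a', 'b', 'c']

del d['b']
d['b'] = 2                  # a deleted and re-added key goes to the end
after_readd = list(d)
print(after_readd)          # ['a', 'c', 'b']
```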
Change order of keys in dictionary - OrderedDict
- collections
- OrderedDict
from collections import OrderedDict
d = OrderedDict()
d['a'] = 1
d['b'] = 2
d['c'] = 3
d['d'] = 4
print(d)
d.move_to_end('a')
print(d)
d.move_to_end('d', last=False)
print(d)
for key in d.keys():
    print(key)
OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
OrderedDict([('b', 2), ('c', 3), ('d', 4), ('a', 1)])
OrderedDict([('d', 4), ('b', 2), ('c', 3), ('a', 1)])
d
b
c
a
Set order of keys in dictionary - OrderedDict
- collections
- OrderedDict
from collections import OrderedDict
d = {}
d['a'] = 1
d['b'] = 2
d['c'] = 3
d['d'] = 4
print(d)
planned_order = ('b', 'c', 'd', 'a')
e = OrderedDict(sorted(d.items(), key=lambda x: planned_order.index(x[0])))
print(e)
print('-----')
# Create index to value mapping dictionary from a list of values
planned_order = ('b', 'c', 'd', 'a')
plan = dict(zip(planned_order, range(len(planned_order))))
print(plan)
f = OrderedDict(sorted(d.items(), key=lambda x: plan[x[0]]))
print(f)
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
OrderedDict([('b', 2), ('c', 3), ('d', 4), ('a', 1)])
-----
{'b': 0, 'c': 1, 'd': 2, 'a': 3}
OrderedDict([('b', 2), ('c', 3), ('d', 4), ('a', 1)])
Setdefault
Trying to access a key that does not exist in a dictionary will result in a KeyError exception.
Using the get method we can avoid this. The get method will return the value of the key if the key exists, None if the key does not exist, or a default value if one was supplied to the get method.
This will not change the dictionary.
Using the setdefault method is similar to the get method, but it will also create the key with the given value.
grades = {}
# print(grades['python']) # KeyError: 'python'
print(grades.get('python')) # None
print(grades.get('python', 'snake')) # snake
print(grades) # {}
print(grades.setdefault('perl')) # None
print(grades) # {'perl': None}
print(grades.setdefault('python', 'snake')) # 'snake'
print(grades) # {'perl': None, 'python': 'snake'}
print(grades.setdefault('python', 'boa')) # 'snake'
print(grades) # {'perl': None, 'python': 'snake'}
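One place where setdefault is handy is grouping values under a key without first checking whether the key exists. The following is a minimal sketch; the animals list and the by_letter name are made up for this illustration.

```python
# Hypothetical data, used only for illustration
animals = ['cat', 'camel', 'dog', 'duck', 'eel']

by_letter = {}
for animal in animals:
    # setdefault returns the list already stored under the key,
    # or stores a new empty list first and returns that
    by_letter.setdefault(animal[0], []).append(animal)

print(by_letter)  # {'c': ['cat', 'camel'], 'd': ['dog', 'duck'], 'e': ['eel']}
```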
Exercise: count characters
- Write a script called count_characters.py that given a long text will count how many times each character appears.
- Change the code so it will be able to count characters in a file.
text = """
This is a very long text.
OK, maybe it is not that long after all.
"""
Exercise: count words
- Create a script called count_words.py
- Skeleton:
words = ['Wombat', 'Rhino', 'Sloth', 'Tarantula', 'Sloth', 'Rhino', 'Sloth']
Expected output: (the order is not important)
Wombat:1
Rhino:2
Sloth:3
Tarantula:1
Exercise: count words from a file
Create a script called count_words_from_a_file.py that, given a file containing only words, spaces and newlines, counts how many times each word appears.
Lorem ipsum dolor qui ad labor ad labor sint dolor tempor incididunt ut labor ad dolore lorem ad
Ut labor ad dolor lorem qui ad ut labor ut ad commodo commodo
Lorem ad dolor in reprehenderit in lorem ut labor ad dolore eu in labor dolor
sint occaecat ad labor proident sint in in qui labor ad dolor ad in ad labor
- Based on Lorem Ipsum
Expected result for the above file:
ad 13
commodo 2
dolor 6
dolore 2
eu 1
in 6
incididunt 1
ipsum 1
labor 10
lorem 5
occaecat 1
proident 1
qui 3
reprehenderit 1
sint 3
tempor 1
ut 5
Exercise: Apache log
Every web server logs the visitors and their requests in a log file. The Apache web server has a log file similar to the following file. (Though I have trimmed the lines for the exercise.) Each line is a "hit", a request from the browser of a visitor.
Each line starts with the IP address of the visitor. e.g. 217.0.22.3.
Create a script called apache_log_parser.py that, given such a log file from Apache, reports how many hits (lines) came from each IP address.
Sample file: src/examples/dictionary/apache_access.log
Expected output:
127.0.0.1 12
139.12.0.2 2
217.0.22.3 7
Exercise: Combine lists again
See the same exercise in the previous chapter. Use the filename combine_lists_using_dictionary.py.
Exercise: counting DNA bases
Write a script called count_dna_bases.py that, given a sequence like this: "ACTNGTGCTYGATRGTAGCYXGTN", will print out the distribution of the elements to get the following result:
A 3 - 12.50 %
C 3 - 12.50 %
G 6 - 25.00 %
N 2 - 8.33 %
R 1 - 4.17 %
T 6 - 25.00 %
X 1 - 4.17 %
Y 2 - 8.33 %
Exercise: Count Amino Acids
- Each sequence consists of many repetitions of the 4 bases represented by the ACTG characters.
- There are 64 codons (sets of 3 bases following each other).
- There are 20 Amino Acids, each of them represented by 3 bases (by one codon).
- Some of the Amino Acids can be represented in multiple ways, as shown in the Codon Table. For example Histidine can be encoded by both CAU and CAC (written as CAT and CAC in the DNA notation used below).
- Create a file called count_amino_acids.py that, given a file with a DNA sequence in it, will count the Amino Acids from the sequence.
- Read the sequence saved in a txt file.
- You can generate a sequence with a random number generator and save it to that file, but it would be much better if you used a real sequence.
- An even better way would be to read the sequence from a FASTA file. You can download one from NCBI.
- Skeleton:
codon_table = {
'Phe' : ['TTT', 'TTC'],
'Leu' : ['TTA', 'TTG', 'CTT', 'CTC', 'CTA', 'CTG'],
'Ile' : ['ATT', 'ATC', 'ATA'],
'Met' : ['ATG'],
'Val' : ['GTT', 'GTC', 'GTA', 'GTG'],
'Ser' : ['TCT', 'TCC', 'TCA', 'TCG', 'AGT', 'AGC'],
'Pro' : ['CCT', 'CCC', 'CCA', 'CCG'],
'Thr' : ['ACT', 'ACC', 'ACA', 'ACG'],
'Ala' : ['GCT', 'GCC', 'GCA', 'GCG'],
'Tyr' : ['TAT', 'TAC'],
'His' : ['CAT', 'CAC'],
'Gln' : ['CAA', 'CAG'],
'Asn' : ['AAT', 'AAC'],
'Lys' : ['AAA', 'AAG'],
'Asp' : ['GAT', 'GAC'],
'Glu' : ['GAA', 'GAG'],
'Cys' : ['TGT', 'TGC'],
'Trp' : ['TGG'],
'Arg' : ['CGT', 'CGC', 'CGA', 'CGG', 'AGA', 'AGG'],
'Gly' : ['GGT', 'GGC', 'GGA', 'GGG'],
'STOP' : ['TAA', 'TAG', 'TGA']
}
- You will want to convert this to a dictionary that maps each codon to an Amino Acid. Do it programmatically!
Exercise: List of dictionaries
Given the following file build a list of dictionaries where each dictionary represents one person. The keys in the dictionary are the names of the columns (fname, lname, born) the values are the respective values from each row. Create a file called list_of_dictionaries.py.
fname,lname,born
Graham,Chapman,8 January 1941
Eric,Idle,29 March 1943
Terry,Gilliam,22 November 1940
Terry,Jones,1 February 1942
John,Cleese,27 October 1939
Michael,Palin,5 May 1943
- Skeleton
# ...
print(people[1]['fname'])
Exercise: Dictionary of dictionaries
Given the following file build a dictionary of dictionaries where each internal dictionary represents one person. The keys in the internal dictionaries are the names of the columns (fname, lname, born) the values are the respective values from each row. In the outer dictionary the keys are the (fname, lname) tuples. Create a file called dictionary_of_dictionaries.py
fname,lname,born
Graham,Chapman,8 January 1941
Eric,Idle,29 March 1943
Terry,Gilliam,22 November 1940
Terry,Jones,1 February 1942
John,Cleese,27 October 1939
Michael,Palin,5 May 1943
Skeleton:
# ...
print(people[('Eric', 'Idle')]['born']) # 29 March 1943
Exercise: Age limit with dictionaries
- Create a file called age_limit_with_dictionary.py
- Ask the user for their age and the country in which they are located.
- Tell them if they can legally drink alcohol.
- See the Legal drinking age list.
- Given a file like the following, create a new file with a third column in which you write "yes" or "no" depending on whether the person can legally drink alcohol in that country.
Exercise: Merge files with timestamps
- Write a script called merge_files_with_timestamps.py
- Given a few CSV files in which the first column is a timestamp, write a script that can merge the files so the merged result also has timestamps in increasing order.
- First try to solve it for 2 files.
- Then solve it for any N files.
1601009973,1
1601009975,3
1601009976,4
1601009978,6
1601009981,9
1601009982,10
1601009983,11
1601009984,12
1601009987,15
1601009989,17
1601009990,18
1601009991,19
1601009992,20
1601009974,2
1601009977,5
1601009980,8
1601009988,16
1601009979,7
1601009985,13
1601009986,14
Solution: count characters
text = """
This is a very long text.
OK, maybe it is not that long after all.
"""
# print(text)
count = {}
for char in text:
    if char == '\n':
        continue
    if char not in count:
        count[char] = 1
    else:
        count[char] += 1
for key in sorted( count.keys() ):
    print("'{}' {}".format(key, count[key]))
- We need to store the counters somewhere. We could use two lists for that, but that would give a complex solution that runs in O(n**2) time.
- Besides, we are in the chapter about dictionaries, so we had better use a dictionary.
- In the count dictionary each key is going to be one of the characters and the respective value will be the number of times it appeared.
- So if our string is "aabx" then we'll end up with
    {
        "a": 2,
        "b": 1,
        "x": 1,
    }
- The for in loop on a string iterates over it character by character (even if we don't call our variable char).
- We check if the current character is a newline \n and if it is, we call continue to skip the rest of the iteration. We don't want to count newlines.
- Then we check if we have already seen this character, that is, whether it is already one of the keys in the count dictionary. If not, we add it with 1 as the value. After all, we saw one copy of this character. If we have already seen this character (we get to the else part) then we increment the counter for this character.
- We are done now with the data collection.
- In the second loop we go over the keys of the dictionary, that is the characters we have encountered. We sort them in ASCII order.
- Then we print each one of them and the respective value, the number of times the character was found.
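The if/else check in the counting loop can also be written with the get method described in the Setdefault section. This is a sketch of an alternative, not the solution used in these slides; it uses the short "aabx" string from the explanation.

```python
text = "aabx"

count = {}
for char in text:
    # get returns 0 for a character we have not seen yet
    count[char] = count.get(char, 0) + 1

print(count)  # {'a': 2, 'b': 1, 'x': 1}
```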
Default Dict
- collections
- defaultdict
counter = {}
word = 'eggplant'
counter[word] += 1
# counter[word] = counter[word] + 1
Traceback (most recent call last):
File "counter.py", line 5, in <module>
counter[word] += 1
KeyError: 'eggplant'
counter = {}
word = 'eggplant'
if word not in counter:
    counter[word] = 0
counter[word] += 1
print(counter)
{'eggplant': 1}
from collections import defaultdict
counter = defaultdict(int)
word = 'eggplant'
counter[word] += 1
print(counter)
defaultdict(<class 'int'>, {'eggplant': 1})
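The factory passed to defaultdict does not have to be int. The sketch below uses list to group words, and then shows a side effect worth knowing: reading a missing key with square brackets creates that key, while get does not. The word list and the variable names are invented for this example.

```python
from collections import defaultdict

# Hypothetical data, used only for illustration
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_length = defaultdict(list)
for word in words:
    by_length[len(word)].append(word)   # a missing key starts as an empty list
print(dict(by_length))  # {5: ['apple'], 7: ['avocado'], 6: ['banana', 'cherry'], 9: ['blueberry']}

counter = defaultdict(int)
print('eggplant' in counter)   # False
print(counter['eggplant'])     # 0, and the key is created by this lookup
print('eggplant' in counter)   # True
print(counter.get('tomato'))   # None, get does not create the key
print('tomato' in counter)     # False
```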
Solution: count characters with default dict
- collections
- defaultdict
from collections import defaultdict
text = """
This is a very long text.
OK, maybe it is not that long after all.
"""
# print(text)
count = defaultdict(int)
for char in text:
    if char == '\n':
        continue
    count[char] += 1
for key in sorted( count.keys() ):
    print("'{}' {}".format(key, count[key]))
- The previous solution can be slightly improved by using defaultdict from the collections module.
- count = defaultdict(int) creates an empty dictionary that has the special feature that if you try to use a key that does not exist, it pretends that it exists and that it has a value of 0.
- This allows us to remove the condition checking if the character was already seen and just increment the counter. The first time we encounter a character the dictionary will pretend that it was already there with value 0, so everything will work out nicely.
Solution: count words (plain)
words = ['Wombat', 'Rhino', 'Sloth', 'Tarantula', 'Sloth', 'Rhino', 'Sloth']
counter = {}
for word in words:
    if word not in counter:
        counter[word] = 0
    counter[word] += 1
for word in counter:
    print("{}:{}".format(word, counter[word]))
Solution: count words (defaultdict)
- defaultdict
from collections import defaultdict
words = ['Wombat', 'Rhino', 'Sloth', 'Tarantula', 'Sloth', 'Rhino', 'Sloth']
counter = defaultdict(int)
for word in words:
    counter[word] += 1
print(counter)
for word in counter.keys():
    print("{}:{}".format(word, counter[word]))
Solution: count words (Counter)
- Counter
from collections import Counter
words = ['Wombat', 'Rhino', 'Sloth', 'Tarantula', 'Sloth', 'Rhino', 'Sloth']
cnt = Counter()
for word in words:
    cnt[word] += 1
print(cnt)
for word in cnt.keys():
    print("{}:{}".format(word, cnt[word]))
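Counter can also do the counting itself when it receives the list directly, and it has a most_common method for the most frequent items. A short sketch with the same word list:

```python
from collections import Counter

words = ['Wombat', 'Rhino', 'Sloth', 'Tarantula', 'Sloth', 'Rhino', 'Sloth']

cnt = Counter(words)          # counts every element of the list
print(cnt['Sloth'])           # 3
print(cnt['Zebra'])           # 0, a missing key does not raise KeyError
print(cnt.most_common(2))     # [('Sloth', 3), ('Rhino', 2)]
```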
Solution: count words in file
from collections import defaultdict
import sys
filename = 'README'
if len(sys.argv) > 1:
    filename = sys.argv[1]
print(filename)
count = defaultdict(int)
with open(filename) as fh:
    for full_line in fh:
        line = full_line.rstrip('\n')
        line = line.lower()
        for word in line.split():
            if word == '':
                continue
            count[word] += 1
for word in sorted(count):
    print("{:13} {:>2}".format(word, count[word]))
Solution: Apache log
from collections import defaultdict
import sys
filename = 'apache_access.log'
if len(sys.argv) > 1:
    filename = sys.argv[1]
count = defaultdict(int)
with open(filename) as fh:
    for line in fh:
        space = line.index(' ')
        ip = line[0:space]
        count[ip] += 1
for ip in count:
    print("{:16} {:>3}".format(ip, count[ip]))
Solution: Apache log using split
from collections import defaultdict
import sys
filename = 'apache_access.log'
if len(sys.argv) > 1:
    filename = sys.argv[1]
count = defaultdict(int)
with open(filename) as fh:
    for line in fh:
        ip, rest = line.split(' ', 1)
        #ip = line.split(' ', 1)[0]
        count[ip] += 1
for ip in count:
    print("{:16} {:>3}".format(ip, count[ip]))
Solution: Combine files
- This is a working, but very verbose solution. Check out the next one!
c = {}
with open('examples/files/a.txt') as fh:
    for line in fh:
        k, v = line.rstrip("\n").split("=")
        if k in c:
            c[k] += int(v)
        else:
            c[k] = int(v)
with open('examples/files/b.txt') as fh:
    for line in fh:
        k, v = line.rstrip("\n").split("=")
        if k in c:
            c[k] += int(v)
        else:
            c[k] = int(v)
with open('out.txt', 'w') as fh:
    for k in sorted(c.keys()):
        fh.write("{}={}\n".format(k, c[k]))
Solution: Combine files-improved
from collections import defaultdict
combined = defaultdict(int)
for filename in (['examples/files/a.txt', 'examples/files/b.txt']):
    with open(filename) as fh:
        for line in fh:
            key, value = line.rstrip("\n").split("=")
            combined[key] += int(value)
with open('out.txt', 'w') as fh:
    for key, value in sorted(combined.items()):
        print("{}={}".format(key, value))
        fh.write("{}={}\n".format(key, value))
Solution: counting DNA bases
from collections import defaultdict
seq = "ACTNGTGCTYGATRGTAGCYXGTN"
count = defaultdict(int)
for cr in seq:
    count[cr] += 1
for cr in sorted(count.keys()):
    print("{} {} - {:>5.2f} %".format(cr, count[cr], 100 * count[cr]/len(seq)))
    # >5 right-aligns the number in a field that is 5 characters wide
    # .2f formats a floating point number with 2 digits after the decimal point
Solution: Count Amino Acids
Generate random DNA sequence
import sys
import random
if len(sys.argv) != 2:
    exit("Need a number")
count = int(sys.argv[1])
dna = []
for _ in range(count):
    dna.append(random.choice(['A', 'C', 'T', 'G']))
print(''.join(dna))
dna = 'CACCCATGAGATGTCTTAACGCTGCTTTCATTATAGCCG'
aa_by_codon = {
'ACG' : '?',
'CAC' : 'Histidine',
'CAT' : 'Histidine',
'CCA' : 'Proline',
'CCG' : 'Proline',
'GAT' : '?',
'GTC' : '?',
'TGA' : '?',
'TTA' : '?',
'CTG' : '?',
'CTT' : '?',
'TCA' : '?',
'TAG' : '?',
#...
}
count = {}
for i in range(0, len(dna)-2, 3):
    codon = dna[i:i+3]
    #print(codon)
    aa = aa_by_codon[codon]
    if aa not in count:
        count[aa] = 0
    count[aa] += 1
for aa in sorted(count.keys()):
    print("{} {}".format(aa, count[aa]))
seq = input('Type your DNA sequence here: ').upper()
codon_table = {
'Phe' : ['TTT', 'TTC'],
'Leu' : ['TTA', 'TTG', 'CTT', 'CTC', 'CTA', 'CTG'],
'Ile' : ['ATT', 'ATC', 'ATA'],
'Met' : ['ATG'],
'Val' : ['GTT', 'GTC', 'GTA', 'GTG'],
'Ser' : ['TCT', 'TCC', 'TCA', 'TCG', 'AGT', 'AGC'],
'Pro' : ['CCT', 'CCC', 'CCA', 'CCG'],
'Thr' : ['ACT', 'ACC', 'ACA', 'ACG'],
'Ala' : ['GCT', 'GCC', 'GCA', 'GCG'],
'Tyr' : ['TAT', 'TAC'],
'His' : ['CAT', 'CAC'],
'Gln' : ['CAA', 'CAG'],
'Asn' : ['AAT', 'AAC'],
'Lys' : ['AAA', 'AAG'],
'Asp' : ['GAT', 'GAC'],
'Glu' : ['GAA', 'GAG'],
'Cys' : ['TGT', 'TGC'],
'Trp' : ['TGG'],
'Arg' : ['CGT', 'CGC', 'CGA', 'CGG', 'AGA', 'AGG'],
'Gly' : ['GGT', 'GGC', 'GGA', 'GGG'],
'STOP' : ['TAA', 'TAG', 'TGA']
}
amino_acids = []
counter = {}
protein_sequence = []
while seq:
    amino_acids.append(seq[:3])
    seq = seq[3:]
for codon in amino_acids:
    if len(codon) < 3:
        print('The remaining bases: {} are not coding for an amino acid'.format(codon))
    for aa in codon_table:
        if codon in codon_table[aa]:
            if aa in counter:
                counter[aa] += 1
            else:
                counter[aa] = 1
            protein_sequence.append(aa)
            break
print(''.join(protein_sequence))
ordered = sorted(counter.keys())
for aa in ordered:
    print('{} {} - {:>5.2f} %'.format(aa, counter[aa], counter[aa]/len(protein_sequence)*100))
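The exercise hinted at converting codon_table into a dictionary that maps each codon to its Amino Acid. One possible way to do that programmatically is sketched below. To keep it short it uses only three entries of the table, and the seq value is a made-up sequence chosen so that every codon is in this shortened table.

```python
# Shortened excerpt of codon_table, for illustration only
codon_table = {
    'Met': ['ATG'],
    'His': ['CAT', 'CAC'],
    'STOP': ['TAA', 'TAG', 'TGA'],
}

# Invert the mapping: every codon becomes a key pointing at its amino acid
aa_by_codon = {}
for aa, codons in codon_table.items():
    for codon in codons:
        aa_by_codon[codon] = aa

seq = 'ATGCATCACTAA'   # hypothetical sequence
count = {}
for i in range(0, len(seq) - 2, 3):
    aa = aa_by_codon[seq[i:i+3]]   # a single dictionary lookup per codon
    count[aa] = count.get(aa, 0) + 1

print(count)  # {'Met': 1, 'His': 2, 'STOP': 1}
```

With the inverted dictionary each codon needs a single lookup instead of a loop over the whole table.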
Solution: List of dictionaries
import sys
filename = 'examples/csv/monty_python.csv'
if len(sys.argv) == 2:
    filename = sys.argv[1]
people = []
with open(filename) as fh:
    fh.readline() # skip first row
    for line in fh:
        line = line.rstrip('\n')
        fname, lname, born = line.split(',')
        people.append({
            'fname': fname,
            'lname': lname,
            'born': born,
        })
print(people[1]['fname'])
import sys
import csv
filename = 'examples/csv/monty_python.csv'
if len(sys.argv) == 2:
    filename = sys.argv[1]
people = []
with open(filename) as fh:
    reader = csv.DictReader(fh)
    for line in reader:
        people.append(line)
print(people[1]['fname'])
Solution: Dictionary of dictionaries
import sys
filename = 'examples/csv/monty_python.csv'
if len(sys.argv) == 2:
    filename = sys.argv[1]
people = {}
with open(filename) as fh:
    fh.readline() # skip first row
    for line in fh:
        line = line.rstrip('\n')
        fname, lname, born = line.split(',')
        people[(fname, lname)] = {
            'fname': fname,
            'lname': lname,
            'born': born,
        }
print(people[('Eric', 'Idle')]['born'])
import sys
import csv
filename = 'examples/csv/monty_python.csv'
if len(sys.argv) == 2:
    filename = sys.argv[1]
people = {}
with open(filename) as fh:
    reader = csv.DictReader(fh)
    for line in reader:
        people[(line['fname'], line['lname'])] = line
print(people[('Eric', 'Idle')]['born'])
Solution: Age limit with dictionaries
legal_drinking_age = {
0 : ['Angola', 'Guinea-Bissau', 'Nigeria', 'Togo', 'Western Sahara', 'Haiti', 'Cambodia', 'Macau'],
15 : ['Central African Republic'],
16 : [
'Gambia',
'Morocco',
'Antigua and Barbuda',
'Barbados',
'British Virgin Islands',
'Cuba',
'Dominica',
'Grenada',
'Saint Lucia',
'Saint Vincent and the Grenadines',
'Palestinian Authority',
'Austria',
'Denmark',
'Germany',
'Gibraltar',
'Liechtenstein',
'Luxembourg',
'San Marino',
'Switzerland'
],
17 : ['Malta'],
19 : ['Canada', 'South Korea'],
20 : ['Benin', 'Paraguay', 'Japan', 'Thailand', 'Uzbekistan', 'Iceland', 'Sweden'],
21 : [
'Cameroon',
'Egypt',
'Equatorial Guinea',
'Bahrain', 'Indonesia',
'Kazakhstan',
'Malaysia',
'Mongolia',
'Oman',
'Qatar',
'Sri Lanka',
'Turkmenistan',
'United Arab Emirates',
'American Samoa',
'Northern Mariana Islands',
'Palau',
'Samoa',
'Solomon Islands'
],
25 : ['USA'],
200 : ['Libya', 'Somalia', 'Sudan', 'Afghanistan', 'Brunei', 'Iran', 'Iraq', 'Kuwait', 'Pakistan', 'Saudi Arabia', 'Yemen'],
}
age = int(input('Please enter your age in number of years: '))
country = input('Please enter the country of your location: ')
for k in legal_drinking_age:
    if country in legal_drinking_age[k]:
        print('The minimum legal drinking age in your location is: {} years'.format(k))
        if age >= k:
            exit('You are allowed to consume alcohol in your location')
        else:
            exit('You are not permitted to consume alcohol currently in your location.')
print('The minimum legal drinking age in your location is: 18 years')
if age >= 18:
    exit('You are allowed to consume alcohol in your location')
else:
    exit('You are not permitted to consume alcohol currently in your location.')
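The loop above searches every list until it finds the country. An alternative, sketched here under the same assumption that unlisted countries default to 18, is to invert the dictionary once so that each country maps to its age limit. Only a small excerpt of the table is used, and the names age_by_country, DEFAULT_AGE and can_drink are invented for this sketch.

```python
# Shortened excerpt of legal_drinking_age, for illustration only
legal_drinking_age = {
    16: ['Austria', 'Germany'],
    20: ['Japan', 'Iceland'],
}

# Invert the mapping once: country -> minimum age
age_by_country = {
    country: limit
    for limit, countries in legal_drinking_age.items()
    for country in countries
}

DEFAULT_AGE = 18   # same fallback as in the solution above

def can_drink(age, country):
    # get falls back to the default for countries that are not listed
    return age >= age_by_country.get(country, DEFAULT_AGE)

print(can_drink(17, 'Germany'))   # True
print(can_drink(19, 'Japan'))     # False
print(can_drink(18, 'Hungary'))   # True, not listed so the limit is 18
```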
Solution: Merge files with timestamps
import sys
file_a = sys.argv[1]
file_b = sys.argv[2]
with open(file_a) as fha:
    with open(file_b) as fhb:
        line_a = None
        line_b = None
        while True:
            if line_a is None:
                line_a = fha.readline()
            if line_b is None:
                line_b = fhb.readline()
            if line_a == '' and line_b == '':
                break
            if line_a == '':
                print(line_b, end='')
                line_b = None
                continue
            if line_b == '':
                print(line_a, end='')
                line_a = None
                continue
            time_a = line_a.split(',')[0]
            time_b = line_b.split(',')[0]
            if int(time_a) < int(time_b):
                print(line_a, end='')
                line_a = fha.readline()
            else:
                print(line_b, end='')
                line_b = fhb.readline()
import sys
files = sys.argv[1:]
fhs = {}
rows = {}
for filename in files:
    try:
        fhs[filename] = open(filename)
        rows[filename] = None
    except Exception:
        print(f"Could not open {filename}")
while True:
    files_with_content = []
    for filename, fh in fhs.items():
        if rows[filename] is None:
            rows[filename] = fh.readline()
        if rows[filename] != '':
            files_with_content.append(filename)
    if not files_with_content:
        break
    # Compare the timestamps as numbers, not as strings
    sorted_rows = sorted(files_with_content, key=lambda filename: int(rows[filename].split(',')[0]))
    smallest = sorted_rows[0]
    print(rows[smallest], end='')
    rows[smallest] = None
for fh in fhs.values():
    fh.close()
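The standard library can also do this kind of merge: heapq.merge combines any number of already sorted inputs lazily. The sketch below is not the solution from these slides. It uses three short lists of rows in place of open files (open file handles could be passed in the same way, since they are also iterables of lines), and the timestamp helper is a name invented here.

```python
import heapq

def timestamp(row):
    # The first CSV column is the timestamp
    return int(row.split(',')[0])

# Stand-ins for three files that are each sorted by timestamp
file_a = ['1601009973,1\n', '1601009975,3\n', '1601009978,6\n']
file_b = ['1601009974,2\n', '1601009977,5\n']
file_c = ['1601009976,4\n', '1601009979,7\n']

merged = list(heapq.merge(file_a, file_b, file_c, key=timestamp))
for row in merged:
    print(row, end='')
```

Like the hand-written versions, this relies on every input already being in increasing timestamp order.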
Do not change dictionary in loop
user = {
    'fname': 'Foo',
    'lname': 'Bar',
}
# list() takes a snapshot of the keys, so adding a key inside this loop is safe
for k in list(user.keys()):
    user['email'] = 'foo@bar.com'
    print(k)
print('-----')
# Adding a key while iterating over the dictionary itself raises an exception
for k in user:
    user['birthdate'] = '1991'
    print(k)
# fname
# lname
# -----
# fname
# Traceback (most recent call last):
# File "examples/dictionary/change_in_loop.py", line 13, in <module>
# for k in user:
# RuntimeError: dictionary changed size during iteration
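If keys really have to be added or removed while going over a dictionary, one common approach is to loop over a copy of the keys, or to build a new dictionary instead of changing the old one. A small sketch with made-up data:

```python
# Hypothetical data, used only for illustration
user = {'fname': 'Foo', 'lname': 'Bar', 'email': '', 'phone': ''}

# list(user) is a snapshot of the keys, so deleting inside the loop is safe
for key in list(user):
    if user[key] == '':
        del user[key]
print(user)  # {'fname': 'Foo', 'lname': 'Bar'}

# Alternative: leave the original alone and build a filtered copy
original = {'fname': 'Foo', 'lname': 'Bar', 'email': ''}
filled = {key: value for key, value in original.items() if value != ''}
print(filled)  # {'fname': 'Foo', 'lname': 'Bar'}
```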
Named tuple (sort of immutable dictionary)
- namedtuple
- A bit like an immutable dictionary
from collections import namedtuple
Person = namedtuple('Person', ['name', 'email'])
one = Person(name='Joe', email='joe@example.com')
two = Person(name='Jane', email='jane@example.com')
print(one.name)
print(two.email)
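The "immutable" part can be seen by trying to change a field. The sketch below also shows _replace, which returns a modified copy, and _asdict, which converts the tuple to a dictionary. The exact text of the error message differs between Python versions, so it is not shown.

```python
from collections import namedtuple

Person = namedtuple('Person', ['name', 'email'])
one = Person(name='Joe', email='joe@example.com')

try:
    one.name = 'Jack'              # fields cannot be reassigned
except AttributeError:
    print('Cannot change a field of a namedtuple')

changed = one._replace(name='Jack')   # creates a new Person
print(changed.name)                   # Jack
print(one.name)                       # Joe, the original is untouched
print(one[0])                         # Joe, it is still a tuple
print(one._asdict()['email'])         # joe@example.com
```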
Create dictionary from List
categories_list = ['animals', 'vegetables', 'fruits']
categories_dict = {cat:[] for cat in categories_list}
print(categories_dict)
categories_dict['animals'].append('cat')
print(categories_dict)
{'animals': [], 'vegetables': [], 'fruits': []}
{'animals': ['cat'], 'vegetables': [], 'fruits': []}
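A related pitfall: dict.fromkeys looks like a shorter way to build the same dictionary, but it stores the very same list object under every key. The sketch below contrasts it with the comprehension used above; the names shared and separate are invented for this example.

```python
categories_list = ['animals', 'vegetables', 'fruits']

# fromkeys reuses ONE list object for all the keys
shared = dict.fromkeys(categories_list, [])
shared['animals'].append('cat')
print(shared)    # {'animals': ['cat'], 'vegetables': ['cat'], 'fruits': ['cat']}

# The comprehension evaluates [] once per key, so every key gets its own list
separate = {cat: [] for cat in categories_list}
separate['animals'].append('cat')
print(separate)  # {'animals': ['cat'], 'vegetables': [], 'fruits': []}
```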
Sort Hungarian letters (lookup table)
letters = [
"a", "á", "b", "c", "cs", "d", "dz", "dzs", "e", "é", "f",
"g", "gy", "h", "i", "í", "j", "k", "l", "ly", "m", "n",
"ny", "o", "ó", "ö", "ő", "p", "q", "r", "s", "sz", "t",
"ty", "u", "ú", "ü", "ű", "v", "w", "x", "y", "z", "zs",
]
print(enumerate(letters))
print('-------')
print(list(enumerate(letters)))
print('-------')
print(dict(enumerate(letters)))
print('-------')
#mapping = {v:k for k, v in dict(enumerate(letters)).items()}
mapping = {letter:ix for ix, letter in enumerate(letters)}
print(mapping)
print('------------------')
text = ["cs", "á", "ő", "ú", "e", "dzs", "zs", "a", "ny"]
print(sorted(text))
print('------------------')
print(sorted(text, key=lambda letter: mapping[letter]))
Sets
sets
- set
- Sets in Python are used when we are primarily interested in operations that we know from set theory.
- See also the Venn diagrams.
- In day-to-day speech we often use the word "group" instead of "set" even though they are not the same.
- What are the common elements of two sets (two groups)?
- Is one group (set) the subset of the other?
- What are all the elements that exist in at least one of the groups (sets)?
- What are the elements that exist in exactly one of the groups (sets)?
set operations
- set
- issubset
- intersection
- symmetric difference (symmetric_difference)
- union
- relative complement (difference)
Creating a set
things = {'table', 'chair', 'door', 'chair'}
print(things)
print(type(things))
if 'table' in things:
    print("has table")
Output:
{'door', 'chair', 'table'}
<class 'set'>
has table
Creating a set from a list
furniture = ['table', 'chair', 'door', 'chair', 'chair']
things = set(furniture)
print(things)
print(type(things))
if 'table' in things:
    print("has table")
Output:
{'table', 'chair', 'door'}
<class 'set'>
has table
Converting set to list
planets = {'Mars', 'Jupiter', 'Saturn', 'Mercury', 'Venus', 'Earth', 'Mars'}
print(planets)
planets_list = list(planets)
print(planets_list)
Output:
{'Jupiter', 'Mars', 'Earth', 'Saturn', 'Venus', 'Mercury'}
['Jupiter', 'Mars', 'Earth', 'Saturn', 'Venus', 'Mercury']
Creating an empty set
objects = set()
print(objects)
print(type(objects))
other = {}
print(other)
print(type(other)) # This is an empty dict and not a set!!!!
Output:
set()
<class 'set'>
{}
<class 'dict'>
Adding an element to a set (add)
objects = set()
print(objects)
objects.add('Mars')
print(objects)
objects.add('Mars')
print(objects)
objects.add('Neptun')
print(objects)
Output:
set()
{'Mars'}
{'Mars'}
{'Neptun', 'Mars'}
Merging one set into another set (update)
objects = set(['Mars', 'Jupiter', 'Saturn'])
internal = set(['Mercury', 'Venus', 'Earth', 'Mars'])
objects.update(internal)
print(objects)
print(internal)
Output:
{'Mars', 'Earth', 'Jupiter', 'Saturn', 'Mercury', 'Venus'}
{'Earth', 'Mars', 'Mercury', 'Venus'}
set intersection
- set
- intersection
english = set(['door', 'car', 'lunar', 'era'])
spanish = set(['era', 'lunar', 'hola'])
print('english: ', english)
print('spanish: ', spanish)
both = english.intersection(spanish)
print(both)
intersection returns the elements that are in both sets.
Output:
english: {'car', 'lunar', 'era', 'door'}
spanish: {'lunar', 'era', 'hola'}
{'lunar', 'era'}
set subset
- set
- issubset
english = set(['door', 'car', 'lunar', 'era'])
spanish = set(['era', 'lunar', 'hola'])
words = set(['door', 'lunar'])
print('issubset: ', words.issubset( english ))
print('issubset: ', words.issubset( spanish ))
Output:
issubset: True
issubset: False
set symmetric difference
- set
- symmetric_difference
english = set(['door', 'car', 'lunar', 'era'])
spanish = set(['era', 'lunar', 'hola'])
diff = english.symmetric_difference(spanish)
print('symmetric_difference: ', diff)
- Symmetric difference contains all the elements in either one of the sets, but not in both. "the ears of the elephant".
Output:
symmetric_difference: {'door', 'hola', 'car'}
set union
- set
- union
english = set(['door', 'car', 'lunar', 'era'])
spanish = set(['era', 'lunar', 'hola'])
all_the_words = english.union(spanish)
print(english)
print(spanish)
print(all_the_words)
# x = english + spanish # TypeError: unsupported operand type(s) for +: 'set' and 'set'
Output:
{'era', 'door', 'lunar', 'car'}
{'era', 'hola', 'lunar'}
{'era', 'door', 'car', 'hola', 'lunar'}
set relative complement (difference)
english = set(['door', 'car', 'lunar', 'era'])
spanish = set(['era', 'lunar', 'hola'])
print(spanish.difference(english))
print(english.difference(spanish))
print()
eng = english - spanish
spa = spanish - english
print(spa)
print(eng)
print()
print(english)
print(spanish)
Output:
{'hola'}
{'door', 'car'}
{'hola'}
{'door', 'car'}
{'door', 'car', 'era', 'lunar'}
{'lunar', 'era', 'hola'}
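Just as difference can be written with the - operator, the other set methods shown in this chapter have operator forms too. In this sketch the results are passed through sorted only to make the printed order predictable, because sets themselves have no defined order.

```python
english = {'door', 'car', 'lunar', 'era'}
spanish = {'era', 'lunar', 'hola'}

print(sorted(english & spanish))     # intersection: ['era', 'lunar']
print(sorted(english | spanish))     # union: ['car', 'door', 'era', 'hola', 'lunar']
print(sorted(english ^ spanish))     # symmetric difference: ['car', 'door', 'hola']
print(sorted(english - spanish))     # difference: ['car', 'door']
print({'door', 'lunar'} <= english)  # issubset: True
```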
Set of numbers
numbers = {2, 3}
print(numbers)
Output:
{2, 3}
Set of lists
lists = set([ [2, 3], [1, 2] ])
Output:
Traceback (most recent call last):
File "/home/gabor/work/slides/python/examples/sets/set_of_lists.py", line 1, in <module>
lists = set([ [2, 3], [1, 2] ])
TypeError: unhashable type: 'list'
Set of tuples
tuples = set([ (2, 3), (1, 2) ])
print(tuples)
print(type(tuples))
Output:
{(2, 3), (1, 2)}
<class 'set'>
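The elements of a set must be hashable, which is why tuples work and lists do not. For the same reason a set cannot contain regular sets, but it can contain frozenset objects, the immutable variant of set. A small sketch:

```python
try:
    sets = {{2, 3}, {1, 2}}        # a regular set is not hashable
except TypeError:
    print('A set cannot contain regular sets')

# frozenset is immutable and hashable, so it can be an element of a set
groups = {frozenset([2, 3]), frozenset([1, 2]), frozenset([3, 2])}
print(len(groups))                   # 2, because {2, 3} and {3, 2} are equal
print(frozenset([1, 2]) in groups)   # True
```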
Create set from List
categories_list = ['animals', 'vegetables', 'fruits']
categories_set = {cat:set() for cat in categories_list}
print(categories_set)
categories_set['animals'].add('cat')
print(categories_set)
Output:
{'animals': set(), 'vegetables': set(), 'fruits': set()}
{'animals': {'cat'}, 'vegetables': set(), 'fruits': set()}
Code Reuse
Permutations
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} n")
'''
n!
'''
n = int(sys.argv[1])
n_fact = 1
for i in range(1, n+1):
    n_fact *= i
print(n_fact)
k-Permutations
import sys
if len(sys.argv) != 3:
    exit(f"Usage: {sys.argv[0]} n r")
'''
            n!
P(n, r) = ------
          (n-r)!
'''
n = int(sys.argv[1])
r = int(sys.argv[2])
n_fact = 1
for i in range(1, n+1):
    n_fact *= i
#print(n_fact)
n_r_fact = 1
for i in range(1, n-r+1):
    n_r_fact *= i
#print(n_r_fact)
P = n_fact // n_r_fact
print(P)
Binomial coefficient
import sys
if len(sys.argv) != 3:
    exit(f"Usage: {sys.argv[0]} n k")
'''
 n        n!
 -  = -----------
 k     k!*(n-k)!
'''
n = int(sys.argv[1])
k = int(sys.argv[2])
n_fact = 1
for i in range(1, n+1):
    n_fact *= i
print(n_fact)
n_k_fact = 1
for i in range(1, n-k+1):
    n_k_fact *= i
print(n_k_fact)
k_fact = 1
for i in range(1, k+1):
    k_fact *= i
print(k_fact)
bc = n_fact // (k_fact * n_k_fact)
print(bc)
Binomial coefficient - factorial function
import sys
if len(sys.argv) != 3:
    exit(f"Usage: {sys.argv[0]} n k")
'''
 n        n!
 -  = -----------
 k     k!*(n-k)!
'''
def fact(x):
    x_fact = 1
    for i in range(1, x+1):
        x_fact *= i
    return x_fact
n = int(sys.argv[1])
k = int(sys.argv[2])
n_fact = fact(n)
print(n_fact)
n_k_fact = fact(n-k)
print(n_k_fact)
k_fact = fact(k)
print(k_fact)
bc = n_fact // (k_fact * n_k_fact)
print(bc)
k-Permutations - factorial function
import sys
if len(sys.argv) != 3:
    exit(f"Usage: {sys.argv[0]} n r")
'''
            n!
P(n, r) = ------
          (n-r)!
'''
def fact(x):
    x_fact = 1
    for i in range(1, x+1):
        x_fact *= i
    return x_fact
n = int(sys.argv[1])
r = int(sys.argv[2])
n_fact = fact(n)
#print(n_fact)
n_r_fact = fact(n-r)
#print(n_r_fact)
P = n_fact // n_r_fact
print(P)
Permutations - factorial function
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} n")
'''
n!
'''
n = int(sys.argv[1])
def fact(x):
    x_fact = 1
    for i in range(1, x+1):
        x_fact *= i
    return x_fact
n_fact = fact(n)
print(n_fact)
mymath module
def fact(x):
    x_fact = 1
    for i in range(1, x+1):
        x_fact *= i
    return x_fact
Permutations - module
import sys
from mymath import fact
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} n")
'''
n!
'''
n = int(sys.argv[1])
n_fact = fact(n)
print(n_fact)
k-Permutations - module
import sys
from mymath import fact
if len(sys.argv) != 3:
    exit(f"Usage: {sys.argv[0]} n r")
'''
            n!
P(n, r) = ------
          (n-r)!
'''
n = int(sys.argv[1])
r = int(sys.argv[2])
n_fact = fact(n)
#print(n_fact)
n_r_fact = fact(n-r)
#print(n_r_fact)
P = n_fact // n_r_fact
print(P)
Binomial coefficient - module
import sys
from mymath import fact
if len(sys.argv) != 3:
    exit(f"Usage: {sys.argv[0]} n k")
'''
 n        n!
 -  = -----------
 k     k!*(n-k)!
'''
n = int(sys.argv[1])
k = int(sys.argv[2])
n_fact = fact(n)
print(n_fact)
n_k_fact = fact(n-k)
print(n_k_fact)
k_fact = fact(k)
print(k_fact)
bc = n_fact // (k_fact * n_k_fact)
print(bc)
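The reuse could go one step further by wrapping the two formulas in functions as well. The sketch below repeats fact so that it runs on its own; the function names permutations and binomial are invented here and are not part of the mymath module shown above. The results are compared with math.perm and math.comb from the standard library, which assumes Python 3.8 or newer.

```python
import math

def fact(x):
    x_fact = 1
    for i in range(1, x+1):
        x_fact *= i
    return x_fact

def permutations(n, r):
    # P(n, r) = n! / (n-r)!
    return fact(n) // fact(n - r)

def binomial(n, k):
    # n! / (k! * (n-k)!)
    return fact(n) // (fact(k) * fact(n - k))

print(permutations(5, 2))                 # 20
print(binomial(5, 2))                     # 10
print(math.perm(5, 2), math.comb(5, 2))   # 20 10
```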
Functions (subroutines)
Why use functions?
There are two main reasons to use functions.
One of them is code reuse. Instead of copy-pasting snippets of code that do the same thing in multiple places in the application, we can create a function with a single copy of the code and call it from multiple locations.
Having functions can also make the code easier to understand, easier to test and to maintain.
The functions are supposed to be relatively short, each function dealing with one issue, with one concern. They should have well-defined input and output and should not cause side effects.
There are no clear rules, but the suggestion is that a function be somewhere between 4 and 30 lines of code.
- Code reuse - DRY - Don't Repeat Yourself
- Small units of code. (One thought, single responsibility) Easier to understand, test, and maintain.
Defining simple function
- def
- return
def add(x, y):
    z = x + y
    return z
a = add(2, 3)
print(a) # 5
q = add(23, 19)
print(q) # 42
The function definition starts with the word "def" followed by the name of the function ("add" in our example), followed by the list of parameters in a pair of parentheses, followed by a colon ":". Then the body of the function is indented to the right. The depth of indentation does not matter but it must be the same for all the lines of the function. When we stop the indentation and start a new statement in the first column, that tells Python that the function definition has ended.
Passing positional parameters to a function
- def
def sendmail(From, To, Subject, Content):
    print('From:', From)
    print('To:', To)
    print('Subject:', Subject)
    print('')
    print(Content)
sendmail('gabor@szabgab.com',
'szabgab@gmail.com',
'self message',
'Has some content too')
Positional parameters.
Function parameters can be named
- named parameter
- keyword argument
def sendmail(From, To, Subject, Content):
    print('From:', From)
    print('To:', To)
    print('Subject:', Subject)
    print('')
    print(Content)
sendmail(
Subject = 'self message',
Content = 'Has some content too',
From = 'gabor@szabgab.com',
To = 'szabgab@gmail.com',
)
The parameters of every function can be passed either as positional parameters or as named parameters.
Mixing positional and named parameters
We have already seen several built-in functions where we mixed positional arguments with some key-value arguments.
fname = "Foo"
lname = "Bar"
animals = ["snake", "mouse", "cat", "dog"]
print(fname, lname, sep="-", end="\n\n")
by_length = sorted(animals, key=len, reverse=True)
print(by_length)
Output:
Foo-Bar
['snake', 'mouse', 'cat', 'dog']
Mixing positional and named parameters - order
We can also mix the parameters passed to any user-defined function, but we have to make sure that positional parameters always come first and named (key-value) parameters come at the end of the parameter list.
def sendmail(From, To, Subject, Content):
    print('From:', From)
    print('To:', To)
    print('Subject:', Subject)
    print('')
    print(Content)
sendmail(
Subject = 'self message',
Content = 'Has some content too',
To = 'szabgab@gmail.com',
'gabor@szabgab.com',
)
File "examples/functions/named_and_positional_params.py", line 14
'gabor@szabgab.com',
^
SyntaxError: positional argument follows keyword argument
def sendmail(From, To, Subject, Content):
    print('From:', From)
    print('To:', To)
    print('Subject:', Subject)
    print('')
    print(Content)
sendmail(
'gabor@szabgab.com',
Subject = 'self message',
Content = 'Has some content too',
To = 'szabgab@gmail.com',
)
Default values, optional parameters
def prompt(question, retry=3):
print(question)
print(retry)
#while retry > 0:
# inp = input('{} ({}): '.format(question, retry))
# if inp == 'my secret':
# return True
# retry -= 1
#return False
prompt("Type in your password")
prompt("Type in your secret", 1)
prompt("Hello", retry=7)
# prompt(retry=7, "Hello") # SyntaxError: positional argument follows keyword argument
prompt(retry=42, question="Is it you?")
Output:
Type in your password
3
Type in your secret
1
Hello
7
Is it you?
42
Function parameters can have default values. In such a case the parameters are optional. In the function declaration, the parameters with default values must come last. In the call, named arguments can be given in any order, and the ones that have a default value can be left out.
Default value in first param
def add(x=2, y):
print("OK")
Output:
File "default_first.py", line 2
def add(x=2, y):
^
SyntaxError: non-default argument follows default argument
Several defaults, using names
- non-keyword arg after keyword arg
Parameters with defaults must come at the end of the parameter declaration.
def f(a, b=2, c=3):
print(a, b , c)
f(1) # 1 2 3
f(1, b=0) # 1 0 3
f(1, c=0) # 1 2 0
f(1, c=0, b=5) # 1 5 0
# f(b=0, 1)
# would generate:
# SyntaxError: positional argument follows keyword argument
f(b=0, a=1) # 1 0 3
def f(a=2, b):
print(a)
print(b)
Output:
File "examples/functions/named_and_positional_bad.py", line 2
def f(a=2, b):
^
SyntaxError: non-default argument follows default argument
There can be several parameters with default values. They are all optional and can be given in any order after the positional arguments.
Default list
# don't use mutable data structures (e.g. lists or dictionaries) as default values
def extend_and_print(names = []):
    names.append("cat")
    print(names)
extend_and_print()
extend_and_print()
print()
def fixed(names = None):
    if names is None:
        names = []
    names.append("dog")
    print(names)
fixed()
fixed()
Output:
['cat']
['cat', 'cat']
['dog']
['dog']
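Why does the first version keep growing? The default list is created only once, when the def statement runs, and every call without an argument reuses that same object. Here is a minimal sketch that makes this visible. The function name extend is made up for this illustration, and it returns the list instead of printing it.

```python
def extend(names=[]):
    # The default list is created once, when "def" is executed.
    names.append("cat")
    return names

first = extend()
second = extend()

print(first is second)      # True: both calls returned the very same list object
print(extend.__defaults__)  # (['cat', 'cat'],) the stored default has been changed
```

Using None as the default and creating the list inside the function, as in fixed() above, gives every call its own fresh list.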
Arbitrary number of arguments *
*args
- tuple
The values arrive as a tuple.
def mysum(*numbers):
print(numbers)
print(type(numbers))
total = 0
for s in numbers:
total += s
return total
from mysum import mysum
print(mysum())
print(mysum(1))
print(mysum(1, 2))
print(mysum(1, 1, 1))
x = 2
y = 7
z = 9
print(mysum(x, y, z))
Output:
()
<class 'tuple'>
0
(1,)
<class 'tuple'>
1
(1, 2)
<class 'tuple'>
3
(1, 1, 1)
<class 'tuple'>
3
(2, 7, 9)
<class 'tuple'>
18
Arbitrary number of arguments passing a list
from mysum import mysum
x = [2, 3, 5, 6]
mysum(x)
Output:
([2, 3, 5, 6],)
<class 'tuple'>
Traceback (most recent call last):
File "/home/gabor/work/slides/python/examples/functions/sum_of_list.py", line 5, in <module>
mysum(x)
File "/home/gabor/work/slides/python/examples/functions/mysum.py", line 6, in mysum
total += s
TypeError: unsupported operand type(s) for +=: 'int' and 'list'
from mysum import mysum
x = [2, 3, 5, 6]
print(mysum(*x))
Output:
(2, 3, 5, 6)
<class 'tuple'>
16
Arbitrary number of arguments passing a tuple
from mysum import mysum
z = (2, 3, 5, 6)
mysum(z)
Output:
((2, 3, 5, 6),)
<class 'tuple'>
Traceback (most recent call last):
File "/home/gabor/work/slides/python/examples/functions/sum_of_tuple.py", line 5, in <module>
mysum(z)
File "/home/gabor/work/slides/python/examples/functions/mysum.py", line 6, in mysum
total += s
TypeError: unsupported operand type(s) for +=: 'int' and 'tuple'
from mysum import mysum
z = (2, 3, 5, 6)
print(mysum(*z))
Output:
(2, 3, 5, 6)
<class 'tuple'>
16
Fixed parameters before the others
The *numbers argument can be preceded by any number of regular arguments.
def mysum(op, *numbers):
    print(numbers)
    if op == '+':
        total = 0
    elif op == '*':
        total = 1
    else:
        raise Exception('invalid operator {}'.format(op))
    for s in numbers:
        if op == '+':
            total += s
        elif op == '*':
            total *= s
    return total
print(mysum('+', 1))
print(mysum('+', 1, 2))
print(mysum('+', 1, 1, 1))
print(mysum('*', 1, 1, 1))
Output:
(1,)
1
(1, 2)
3
(1, 1, 1)
3
(1, 1, 1)
1
Pass arbitrary number of functions
- As an advanced example we could even pass an arbitrary number of functions
def run_these(value, *functions):
print(functions)
for func in functions:
print(func(value))
run_these("abc", len, lambda x: x+x, lambda y: f"text: {y}")
Output:
(<built-in function len>, <function <lambda> at 0x7fcb4e8bedc0>, <function <lambda> at 0x7fcb4e8bee50>)
3
abcabc
text: abc
Arbitrary key-value pairs in parameters **
- **kwargs
def f(**kw):
print(kw)
f(a=23, b=12)
f(x=11, y=99, z=1)
Output:
{'a': 23, 'b': 12}
{'x': 11, 'y': 99, 'z': 1}
Pass a real dictionary
def func(**kw):
print(kw)
func(a = 23,
b = 19,)
z = {
'c': 10,
'd': 20,
}
func(z = z)
func(**z)
Output:
{'a': 23, 'b': 19}
{'z': {'c': 10, 'd': 20}}
{'c': 10, 'd': 20}
The dictionary contains a copy
def f(**kw):
print(kw)
kw['a'] = 7
print(kw)
z = 23
f(a=10, b=12)
f(a=z, y=99, z=1)
print(z)
Output:
{'a': 10, 'b': 12}
{'a': 7, 'b': 12}
{'a': 23, 'y': 99, 'z': 1}
{'a': 7, 'y': 99, 'z': 1}
23
The dictionary contains a copy but NOT a deep copy!
def f(**kw):
print(kw)
print(hex(id(kw['z'])))
kw['z']['a'] = 7
z = {'a': 1, 'b': 2}
print(z)
print(hex(id(z)))
f(z = z)
print(z)
Output:
{'a': 1, 'b': 2}
0x7f01fd163180
{'z': {'a': 1, 'b': 2}}
0x7f01fd163180
{'a': 7, 'b': 2}
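If the function must not change the caller's nested data, one option is to make a deep copy inside the function before modifying anything. This is only a sketch of that idea. The name settings is made up for the illustration, and copy.deepcopy comes from the standard library.

```python
import copy

def f(**kw):
    # Work on a private copy, so nested objects of the caller stay untouched.
    settings = copy.deepcopy(kw['z'])
    settings['a'] = 7
    return settings

z = {'a': 1, 'b': 2}
changed = f(z=z)
print(changed)  # {'a': 7, 'b': 2}
print(z)        # {'a': 1, 'b': 2}
```

A deep copy costs time and memory for large structures, so it is worth doing only when the function really needs to change the data.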
Extra key-value pairs in parameters
**kwargs
def f(name, **kw):
print(name)
print(kw)
f(name="Foo", a=23, b=12)
f(a=23, name="Bar", b=12)
Output:
Foo
{'a': 23, 'b': 12}
Bar
{'a': 23, 'b': 12}
Extra key-value pairs in parameters for email
def sendmail(From, To, Subject, Content, **header):
print('From:', From)
print('To:', To)
print('Subject:', Subject)
for field, value in header.items():
print(f"X-{field}: {value}")
print('')
print(Content)
sendmail(
Subject = 'self message',
Content = 'Has some content too',
From = 'gabor@szabgab.com',
To = 'szabgab@gmail.com',
mailer = "Python",
signature = "My sig",
)
Output:
From: gabor@szabgab.com
To: szabgab@gmail.com
Subject: self message
X-mailer: Python
X-signature: My sig
Has some content too
Every parameter option
def f(op, count=0, *things, **kw):
print(op)
print(count)
print(things)
print(kw)
f(2, 3, 4, 5, a=23, b=12)
Output:
2
3
(4, 5)
{'a': 23, 'b': 12}
Duplicate declaration of functions (multiple signatures)
def add(x, y):
return x*y
print(add(2, 3)) # 6
def add(x):
return x+x
print(add(2)) # 4
add(2, 3)
# TypeError: add() takes 1 positional argument but 2 were given
Output:
4
Traceback (most recent call last):
File "examples/functions/duplicate_add.py", line 9, in <module>
add(2, 3)
TypeError: add() takes 1 positional argument but 2 were given
The second declaration silently overrides the first declaration.
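Python does not pick a function by the number of arguments, so if we would like one name to accept both one and two values, a single definition with a default value is a common way to get there. The sketch below is an assumption about what such a function might look like, and it really adds the numbers, unlike the first add above that multiplied them.

```python
def add(x, y=None):
    """Return x + y, or x + x when only one value is given."""
    if y is None:
        y = x
    return x + y

print(add(2, 3))  # 5
print(add(2))     # 4
```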
Pylint duplicate declaration
- pylint can find such problems, along with a bunch of others.
pylint -E duplicate_add.py
Output:
************* Module duplicate_add
examples/functions/duplicate_add.py:4:0: E0102: function already defined line 1 (function-redefined)
examples/functions/duplicate_add.py:9:0: E1121: Too many positional arguments for function call (too-many-function-args)
Return more than one value
def calc(x, y):
a = x+y
b = x*y
return a, b
t = calc(4, 5)
print(t)
print(type(t))
z, q = calc(2, 3)
print(z)
print(q)
Output:
(9, 20)
<class 'tuple'>
5
6
Recursive factorial
n! = n * (n-1) ... * 1
0! = 1
n! = n * (n-1)!
f(0) = 1
f(n) = n * f(n-1)
def f(n):
if int(n) != n or n < 0:
raise ValueError("Bad parameter")
if n == 0:
return 1
return n * f(n-1)
print(f(1)) # 1
print(f(2)) # 2
print(f(3)) # 6
print(f(4)) # 24
f(-1)
Recursive Fibonacci
fib(1) = 1
fib(2) = 1
fib(n) = fib(n-1) + fib(n-2)
def fib(n):
if int(n) != n or n <= 0:
raise ValueError("Bad parameter")
if n == 1:
return 1
if n == 2:
return 1
return fib(n-1) + fib(n-2)
print(3, fib(3)) # 2
print(30, fib(30)) # 832040
fib(0.5)
Python also supports recursive functions.
Non-recursive Fibonacci
def fib(n):
if n == 1:
return [1]
if n == 2:
return [1, 1]
fibs = [1, 1]
for _ in range(2, n):
fibs.append(fibs[-1] + fibs[-2])
return fibs
print(fib(1)) # [1]
print(fib(2)) # [1, 1]
print(fib(3)) # [1, 1, 2]
print(fib(10)) # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
Unbounded recursion
- In order to protect us from unlimited recursion, Python limits the depth of recursion:
def recursion(n):
print(f"In recursion {n}")
recursion(n+1)
recursion(1)
Output:
...
In recursion 995
In recursion 996
Traceback (most recent call last):
File "recursion.py", line 7, in <module>
recursion(1)
File "recursion.py", line 5, in recursion
recursion(n+1)
File "recursion.py", line 5, in recursion
recursion(n+1)
File "recursion.py", line 5, in recursion
recursion(n+1)
[Previous line repeated 992 more times]
File "recursion.py", line 4, in recursion
print(f"In recursion {n}")
RecursionError: maximum recursion depth exceeded while calling a Python object
Set recursion limit
import sys
print(sys.getrecursionlimit())
sys.setrecursionlimit(10)
def recursion(n):
print(f"In recursion {n}")
recursion(n+1)
recursion(1)
Output:
1000
In recursion 1
In recursion 2
In recursion 3
In recursion 4
In recursion 5
In recursion 6
In recursion 7
Traceback (most recent call last):
File "/home/gabor/work/slides/python/examples/functions/recursion_set_limit.py", line 10, in <module>
recursion(1)
File "/home/gabor/work/slides/python/examples/functions/recursion_set_limit.py", line 8, in recursion
recursion(n+1)
File "/home/gabor/work/slides/python/examples/functions/recursion_set_limit.py", line 8, in recursion
recursion(n+1)
File "/home/gabor/work/slides/python/examples/functions/recursion_set_limit.py", line 8, in recursion
recursion(n+1)
[Previous line repeated 4 more times]
File "/home/gabor/work/slides/python/examples/functions/recursion_set_limit.py", line 7, in recursion
print(f"In recursion {n}")
RecursionError: maximum recursion depth exceeded while calling a Python object
Variable assignment and change - Immutable
Details are shown on the next slide.
a = 42 # number or string
b = a # This is a copy
print(a) # 42
print(b) # 42
a = 1
print(a) # 1
print(b) # 42
a = (1, 2) # tuple
b = a # this is a copy
print(a) # (1, 2)
print(b) # (1, 2)
# a[0] = 42 TypeError: 'tuple' object does not support item assignment
a = (3, 4, 5)
print(a) # (3, 4, 5)
print(b) # (1, 2)
Variable assignment and change - Mutable list
b = [5, 6]
a = b # this is a copy of the *reference* only
# if we change the list in a, it will
# change the list connected to b as well
print(a) # [5, 6]
print(b) # [5, 6]
a[0] = 1
print(a) # [1, 6]
print(b) # [1, 6]
a = [7, 8] # replace the whole list
print(a) # [7, 8]
print(b) # [1, 6]
Variable assignment and change - Mutable dict
b = {'name' : 'Foo'}
a = b # this is a copy of the *reference* only
# if we change the dictionary in a, it will
# change the dictionary connected to b as well
print(a) # {'name' : 'Foo'}
print(b) # {'name' : 'Foo'}
a['name'] = 'Jar Jar'
print(a) # {'name' : 'Jar Jar'}
print(b) # {'name' : 'Jar Jar'}
# replace reference
a = {'name': 'Foo Bar'}
print(a) # {'name': 'Foo Bar'}
print(b) # {'name': 'Jar Jar'}
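To check whether two names refer to the same object we can use the "is" operator, and to get an independent dictionary we can ask for a copy. A short sketch (the variable c is added for this illustration):

```python
b = {'name': 'Foo'}
a = b           # the same object with two names
c = b.copy()    # a new dictionary with the same content (a shallow copy)

print(a is b)   # True
print(c is b)   # False

a['name'] = 'Jar Jar'
print(b)        # {'name': 'Jar Jar'}
print(c)        # {'name': 'Foo'}
```

Lists work the same way: a[:] or a.copy() creates a new list, while plain assignment only adds another name for the existing one.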
Parameter passing of functions
x = 3
def inc(n):
n += 1
return n
print(x) # 3
print(inc(x)) # 4
print(x) # 3
Passing references
numbers = [1, 2, 3]
def update(x):
x[0] = 23
def change(y):
y = [5, 6]
return y
def replace_content(z):
z[:] = [7, 8]
return z
print(numbers) # [1, 2, 3]
update(numbers)
print(numbers) # [23, 2, 3]
print(change(numbers)) # [5, 6]
print(numbers) # [23, 2, 3]
print(replace_content(numbers)) # [7, 8]
print(numbers) # [7, 8]
Function documentation
def f(name):
"""
The documentation
should have more than one line.
"""
print(name)
f("hello")
print(f.__doc__)
Immediately after the definition of the function, you can add a string that holds the documentation of the function. It can be a """ string so that it can span multiple lines. This string can be accessed via the __doc__ attribute of the function (two underscores on each side). Also, if you import the file as a module in the interactive prompt of Python, you will be able to read this documentation via the help() function: help(mydocs) or help(mydocs.f) in the above case.
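The raw __doc__ string keeps the indentation of the source code. If that is disturbing, the inspect module of the standard library can return a cleaned-up version. A small sketch, with a docstring made up for the example:

```python
import inspect

def f(name):
    """
    Print the name.

    A second line of documentation.
    """
    print(name)

print(f.__doc__)          # the raw string, including the leading spaces
print(inspect.getdoc(f))  # the same text with the indentation removed
```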
Sum ARGV
import sys
def mysum(*numbers):
print(numbers)
total = 0
for s in numbers:
total += s
return total
v = [int(x) for x in sys.argv[1:] ]
r = mysum( *v )
print(r)
Copy-paste code
a = [2, 3, 93, 18]
b = [27, 81, 11, 35]
c = [32, 105, 1]
total_a = 0
for v in a:
total_a += v
print("sum of a: {} average of a: {}".format(total_a, total_a / len(a)))
total_b = 0
for v in b:
total_b += v
print("sum of b: {} average of b: {}".format(total_b, total_b / len(b)))
total_c = 0
for v in c:
total_c += v
print("sum of c: {} average of c: {}".format(total_c, total_c / len(a)))
sum of a: 116 average of a: 29.0
sum of b: 154 average of b: 38.5
sum of c: 138 average of c: 34.5
Did you notice the bug?
Copy-paste code fixed
a = [2, 3, 93, 18]
b = [27, 81, 11, 35]
c = [32, 105, 1]
def calc(numbers):
total = 0
for v in numbers:
total += v
return total, total / len(numbers)
total_a, avg_a = calc(a)
print("sum of a: {} average of a: {}".format(total_a, avg_a))
total_b, avg_b = calc(b)
print("sum of b: {} average of b: {}".format(total_b, avg_b))
total_c, avg_c = calc(c)
print("sum of c: {} average of c: {}".format(total_c, avg_c))
sum of a: 116 average of a: 29.0
sum of b: 154 average of b: 38.5
sum of c: 138 average of c: 46.0
Copy-paste code further improvement
data = {
'a': [2, 3, 93, 18],
'b': [27, 81, 11, 35],
'c': [32, 105, 1],
}
def calc(numbers):
total = 0
for v in numbers:
total += v
return total, total / len(numbers)
total = {}
avg = {}
for name, numbers in data.items():
total[name], avg[name] = calc(numbers)
print("sum of {}: {} average of {}: {}".format(name, total[name], name, avg[name]))
Palindrome
An iterative and a recursive solution
def is_palindrome(s):
if s == '':
return True
if s[0] == s[-1]:
return is_palindrome(s[1:-1])
return False
def iter_palindrome(s):
for i in range(0, int(len(s) / 2)):
if s[i] != s[-(i+1)]:
return False
return True
print(is_palindrome('')) # True
print(is_palindrome('a')) # True
print(is_palindrome('ab')) # False
print(is_palindrome('aa')) # True
print(is_palindrome('aba')) # True
print(is_palindrome('abc')) # False
print()
print(iter_palindrome('')) # True
print(iter_palindrome('a')) # True
print(iter_palindrome('ab')) # False
print(iter_palindrome('aa')) # True
print(iter_palindrome('aba')) # True
print(iter_palindrome('abc')) # False
Exit vs return vs break and continue
- exit
- return
- break
- continue
- exit will stop your program no matter where you call it.
- return will return from a function (it will stop the specific function only).
- break will stop the current "while" or "for" loop.
- continue will stop the current iteration of the current "while" or "for" loop.
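The four statements can be seen side by side in a small sketch. The function names and the numbers are made up for this illustration.

```python
import sys

def first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n >= 0:
            continue   # skip the rest of this iteration only
        return n       # leave the whole function (and the loop with it)
    return None

def sum_until_zero(numbers):
    """Add up the numbers that come before the first 0."""
    total = 0
    for n in numbers:
        if n == 0:
            break      # leave the loop; the code after the loop still runs
        total += n
    return total

def main(numbers):
    if not numbers:
        sys.exit("No numbers given")  # ends the whole program
    print(first_negative(numbers))    # -2
    print(sum_until_zero(numbers))    # 7

main([3, 4, 0, -2, 5])
```

Calling main([]) would end the program at the sys.exit call, and nothing after it would run.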
Exercise: statistics
Create a file called statistics.py that has a function that will accept any number of numbers and return a list of values:
- The sum
- Average
- Minimum
- Maximum
Exercise: Pascal's triangle
- Create a file called pascal_triangle.py that given a number N on the command line will print the first N rows of the Pascal's triangle.
Exercise: Pascal's triangle functions
- Create a file called pascal_triangle_functions.py that will do exactly as the previous one, but this time make sure you have these functions:
- A function called get_next_row that, given a list of numbers (a row from the triangle, e.g. 1, 3, 3, 1), will return the next row (1, 4, 6, 4, 1).
- A function called get_triangle that, given a depth N, will return a list of the first N rows.
- A function called print_triangle that will print the triangle.
Exercise: recursive dependency tree
- Create a file called recursive_dependency_tree.py
Given a bunch of files that each hold a list of requirements, process them recursively and print the resulting full list of requirements. For example, with these three files:
a.txt
b
c
d
b.txt
e
d
c.txt
f
g
$ python traversing_dependency_tree.py a
Processing a
Processing b
Processing e
Processing d
Processing c
Processing f
Processing g
Processing d
Exercise: dependency tree
- Create a file called dependency_tree.py
That will process the files holding the dependency tree, but without recursive calls.
Exercise: Tower of Hanoi
- Create a script called tower_of_hanoi.py providing a solution to Tower of Hanoi
There are 3 sticks. On the first stick there are n rings of different sizes. The smaller the ring the higher it is on the stick. Move all the rings to the 3rd stick by moving only one ring at a time and making sure that a larger ring is never placed on top of a smaller ring.
Exercise: Merge and Bubble sort
- Implement bubble sort in a file called bubble_sort.py
- Implement merge sort in a file called merge_sort.py
Exercise: Refactor previous solutions to use functions
- Go over all of the previous exercises and their solutions (e.g. the games)
- Take one (or more if you like this exercise) and change them to use functions.
- If possible make sure you don't have any variable definitions outside of the functions and that each function has a single job to do.
- For each case use the same filename just add at the end: with_functions.py
Exercise: Number guessing - functions
Take the number guessing game from the earlier chapter and move the internal while() loop to a function.
Solution: statistics
def stats(*numbers):
    total = 0
    average = None # there might be better solutions here!
    minx = None
    maxx = None
    for val in numbers:
        total += val
        if minx is None:
            minx = maxx = val
        if minx > val:
            minx = val
        if maxx < val:
            maxx = val
    if len(numbers):
        average = total / len(numbers)
    return total, average, minx, maxx
ttl, avr, smallest, largest = stats(3, 5, 4)
print(ttl)
print(avr)
print(smallest)
print(largest)
Solution: Pascal triangle
import sys
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} N")
rows = int(sys.argv[1])
row = []
for current in range(0, rows):
if row == []:
next_row = [1]
else:
next_row = []
temp_row = [0] + row + [0]
for ix in range(len(temp_row)-1):
next_row.append(temp_row[ix]+temp_row[ix+1])
row = next_row
print(row)
Solution: Pascal triangle functions
import sys
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} N")
def get_next_row(row):
if row == []:
next_row = [1]
else:
next_row = []
temp_row = [0] + row + [0]
for ix in range(len(temp_row)-1):
next_row.append(temp_row[ix]+temp_row[ix+1])
return next_row
def get_triangle(rows):
triangle = []
row = []
for current in range(0, rows):
row = get_next_row(row)
triangle.append(row)
return triangle
def print_triangle(triangle):
for row in triangle:
print(row)
triangle = get_triangle(int(sys.argv[1]))
print_triangle(triangle)
Solution: recursive
import sys
import os
if len(sys.argv) < 2:
exit("Usage: {} NAME".format(sys.argv[0]))
start = sys.argv[1]
def get_dependencies(name):
    print("Processing {}".format(name))
    deps = {name}  # set(name) would split a longer name into single characters
    filename = name + ".txt"
    if not os.path.exists(filename):
        return deps
    with open(filename) as fh:
        for line in fh:
            row = line.rstrip("\n")
            deps.add(row)
            deps.update( get_dependencies(row) )
    return deps
dependencies = get_dependencies(start)
print(dependencies)
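For the non-recursive version of the exercise, one possible approach is to keep a list of names that still need to be processed. The sketch below is not the original solution. To stay self-contained it assumes the requirements are held in a dictionary instead of files, and the names todo and requirements are made up for the illustration.

```python
def get_dependencies(start, requirements):
    """Collect everything that start depends on, directly or indirectly."""
    deps = set()
    todo = [start]  # names that are still waiting to be processed
    while todo:
        name = todo.pop()
        for dep in requirements.get(name, []):
            if dep not in deps:
                deps.add(dep)
                todo.append(dep)
    return deps

requirements = {
    'a': ['b', 'c', 'd'],
    'b': ['e', 'd'],
    'c': ['f', 'g'],
}
print(sorted(get_dependencies('a', requirements)))
# ['b', 'c', 'd', 'e', 'f', 'g']
```

Because every name is added to the todo list only once, this version also stops if two entries depend on each other.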
Solution: Tower of Hanoi
def check():
for loc in hanoi.keys():
if hanoi[loc] != sorted(hanoi[loc], reverse=True):
raise Exception(f"Incorrect order in {loc}: {hanoi[loc]}")
def move(depth, source, target, helper):
if depth > 0:
move(depth-1, source, helper, target)
val = hanoi[source].pop()
hanoi[target].append(val)
print(f"Move {val} from {source} to {target} Status A:{str(hanoi['A']):10} B:{str(hanoi['B']):10} C:{str(hanoi['C']):10}")
check()
move(depth-1, helper, target, source)
check()
hanoi = {
'A': [4, 3, 2, 1],
'B': [],
'C': [],
}
check()
move(len(hanoi['A']), 'A', 'C', 'B')
check()
def check():
for loc in ['A', 'B', 'C']:
print(f"{loc} {hanoi[loc]}", end=' ')
if hanoi[loc] != sorted(hanoi[loc], reverse=True):
raise Exception(f"Incorrect order in {loc}: {hanoi[loc]}")
print('')
def move(source, target, helper):
#if not hanoi[source]:
# return
if len(hanoi[source]) == 1:
disk = hanoi[source].pop()
print(f"Move {disk} from {source} to {target}")
hanoi[target].append(disk)
return
big_disk = hanoi[source].pop(0) # pretend the biggest disk is not there
move(source, helper, target)
print(f"Move {big_disk} from {source} to {target}")
move(helper, target, source)
hanoi[target].insert(0, big_disk) # stop pretending
check()
hanoi = {
'A': [4, 3, 2, 1],
'B': [],
'C': [],
}
check()
move('A', 'C', 'B')
check()
Solution: Merge and Bubble sort
def bubble_sort(*values):
values = list(values)
for ix in range(len(values)-1):
for jx in range(len(values)-1-ix):
if values[jx] > values[jx+1]:
values[jx], values[jx+1] = values[jx+1], values[jx]
return values
print(bubble_sort(1, 2, 3))
print(bubble_sort(3, 2, 1))
print(bubble_sort(10, 9, 8, 7, 6, 5, 4, 3, 2, 1))
def iterative_bubble_sort(data):
data = data[:]
for end in (range(len(data)-1, 0, -1)):
for i in range(end):
if data[i] < data[i+1]:
data[i], data[i+1] = data[i+1], data[i]
return data
old = [1, 5, 2, 4, 8]
new = iterative_bubble_sort(old)
print(old)
print(new)
def recursive_bubble_sort(data):
data = data[:]
if len(data) == 1:
return data
last = data.pop()
sorted_data = recursive_bubble_sort(data)
for i in range(len(sorted_data)):
if last > sorted_data[i]:
sorted_data.insert(i, last)
break
else:
sorted_data.append(last)
return sorted_data
old = [1, 5, 2, 4, 8]
new = recursive_bubble_sort(old)
print(old)
print(new)
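The examples above cover only bubble sort. A merge sort could look something like the following. This is a sketch and not the original solution; it sorts in ascending order and returns a new list.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])   # at most one of these two still has items
    result.extend(right[j:])
    return result

def merge_sort(data):
    """Return a new list with the values of data in ascending order."""
    if len(data) <= 1:
        return data[:]
    middle = len(data) // 2
    return merge(merge_sort(data[:middle]), merge_sort(data[middle:]))

old = [1, 5, 2, 4, 8]
new = merge_sort(old)
print(old)  # [1, 5, 2, 4, 8]
print(new)  # [1, 2, 4, 5, 8]
```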
Modules
Goal of having modules
- Code reuse: Allow multiple scripts to reuse the same function without copying the code.
- Better code design.
- Separation of concerns: Functions dealing with one subject are grouped together in one module.
Before modules
Let's take a very simple script that has a single, and very simple function in it.
def add(a, b):
return a + b
z = add(2, 3)
print(z) # 5
Create modules
A module is just a Python file with a set of functions that is usually not used by itself. For example "my_calculator.py":
def add(a, b):
return a + b
A user-made module is loaded exactly the same way as a built-in module. The functions defined in the module are used as if they were methods, with the dot-notation.
import my_calculator
z = my_calculator.add(2, 3)
print(z) # 5
We can import specific functions to the current name space (symbol table) and then we don't need to prefix it with the name of the file every time we use it. This might be shorter writing, but if we import the same function name from two different modules then they will overwrite each other. So I usually prefer loading the module as in the previous example.
from my_calculator import add
print(add(2, 3)) # 5
- Using with an alias
import my_calculator as calc
z = calc.add(2, 3)
print(z) # 5
path to load modules from - The module search path
- PYTHONPATH
- .pth
There are several steps Python does when it searches for the location of a file to be imported, but the most important one is what we see on the next page in sys.path.
- The directory where the main script is located.
- The directories listed in PYTHONPATH environment variable.
- Directories of standard libraries.
- Directories listed in .pth files.
- The site-packages home of third-party extensions.
sys.path - the module search path
- sys
- path
import sys
print(sys.path)
['/Users/gabor/work/training/python/examples/package',
'/Users/gabor/python/lib/python2.7/site-packages/crypto-1.1.0-py2.7.egg',
...
'/Library/Python/2.7/site-packages', '/usr/local/lib/python2.7/site-packages']
Project directory layouts
- Flat project
- Absolute path
- Relative path
- Using submodules
Flat project directory structure
If our executable scripts and our modules are all in the same directory then we don't have to worry, as the directory of the script is included in the list of places where "import" is looking for the files to be imported.
project/
script_a.py
script_b.py
my_module.py
Absolute path
If we would like to load a module that is not installed in one of the standard locations, but we know where it is located on our disk, we can add the absolute path of its directory to "sys.path". This works on the specific computer, but if you'd like to distribute the script to other computers you'll have to make sure the module to be loaded is installed in the same location, or you'll have to update the script to point to the location of the module on each computer. This is not an ideal solution.
import sys
# On Linux
sys.path.insert(0, "/home/foobar/python/libs")
# On Windows
# sys.path.insert(0, r"c:\Users\FooBar\python\libs")
# import module_name
Relative path
- file
- dirname
- abspath
- sys.path
../project_root/
bin/relative_path.py
lib/my_module.py
We can use a directory structure that is more complex than the flat structure we had earlier. In this case the location of the modules relative to the scripts is fixed: it is "../lib". We can compute this path in each of our scripts. That will ensure we pick up the right module every time we run the script, regardless of the location of the whole project tree.
def run():
print("Hello from my_module")
import os
import sys
project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, os.path.join(project_root, 'lib'))
import my_module
my_module.run()
Relative path explained
../project_root/
bin/relative_path_explained.py
lib/my_module.py
import os
import sys
print(__file__)
print(os.path.abspath(__file__))
print(os.path.dirname(os.path.abspath(__file__)))
project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
print(project_root)
mypath = os.path.join(project_root, 'lib')
print(mypath)
sys.path.insert(0, mypath)
import my_module
my_module.run()
examples/project_root/bin/relative_path_explained.py
/home/gabor/work/slides/python/examples/project_root/bin/relative_path_explained.py
/home/gabor/work/slides/python/examples/project_root/bin
/home/gabor/work/slides/python/examples/project_root
/home/gabor/work/slides/python/examples/project_root/lib
Hello from my_module
Submodules
aproject/
app.py
mymodules/math.py
import mymodules.math
z = mymodules.math.add(2, 3)
print(z)
def add(x, y):
return x + y
Python modules are compiled
- pyc
- pycache
When libraries are loaded they are automatically compiled to .pyc files.
This provides moderate code-hiding and load-time speed-up. Not run-time speed-up.
Starting from Python 3.2 the pyc files are saved in the __pycache__ directory.
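If you are curious where the compiled file of a given source file would be saved, the standard library can tell you. A small sketch using the my_calculator.py name from the earlier example; the file does not even have to exist for this call, and the exact result depends on the Python version and the operating system.

```python
import importlib.util
import sys

path = importlib.util.cache_from_source("my_calculator.py")
print(path)                           # e.g. __pycache__/my_calculator.cpython-311.pyc
print(sys.implementation.cache_tag)   # e.g. cpython-311
```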
How "import" and "from" work?
- import
- Find the file to load.
- Compile to bytecode if necessary and save the bytecode if possible.
- Run the code of the file loaded.
- Copy names from the imported module to the importing namespace.
Execute at import time
import lib
print("Hello")
print("import lib")
def do_something():
print("do something")
import lib
Hello
Runtime loading of modules
The import statements in Python are executed at the point where they are located in the code. If you have some code before the import statement (print Start running) it will be executed before the importing starts.
During the importing any code that is outside of functions and classes in the imported module is executed. (print Loading mygreet).
Then you can call functions from the module (print Hello World).
Or call code that is in the importing program (print DONE).
def hello():
print("Hello World")
print("Loading mygreet")
import mygreet
print("Start running") # Start running
import mygreet # Loading mygreet
print("import done") # import done
mygreet.hello() # Hello World
print("DONE") # DONE
Conditional loading of modules
import random
print("Start running")
name = input("Your name:")
if name == "Foo":
import mygreet
mygreet.hello()
else:
print('No loading')
print("DONE")
What is in our namespace?
print(dir())
import sys
print(dir())
from sys import argv
print(dir())
['__annotations__', '__builtins__', '__cached__', '__doc__',
'__file__', '__loader__', '__name__', '__package__', '__spec__']
['__annotations__', '__builtins__', '__cached__', '__doc__',
'__file__', '__loader__', '__name__', '__package__', '__spec__', 'sys']
['__annotations__', '__builtins__', '__cached__', '__doc__',
'__file__', '__loader__', '__name__', '__package__', '__spec__', 'argv', 'sys']
Runtime import
- We can use the name of a module that comes from an expression
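One way to do this is the import_module function of the importlib standard library. In this sketch the module name is put together from two strings only to show that it can come from any expression; the helper function load is made up for the example.

```python
import importlib

def load(module_name):
    """Import a module whose name is known only at run time."""
    return importlib.import_module(module_name)

name = "js" + "on"  # the name could also come from a config file or the command line
mod = load(name)
print(mod.__name__)          # json
print(mod.dumps({"a": 1}))   # {"a": 1}
```

If the name comes from the user, it is worth checking it against a list of allowed modules first, because importing a module runs its code.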
Duplicate importing of functions
from mycalc import add
print(add(2, 3)) # 5
from mymath import add
print(add(2, 3)) # 6
from mycalc import add
print(add(2, 3)) # 5
The second declaration silently overrides the first declaration.
pylint can find such problems, along with a bunch of others.
pylint --disable=C duplicate_add_from_module.py
************* Module duplicate_add_from_module
duplicate_add_from_module.py:4:0: W0404: Reimport 'add' (imported line 1) (reimported)
duplicate_add_from_module.py:7:0: W0404: Reimport 'add' (imported line 1) (reimported)
------------------------------------------------------------------
Your code has been rated at 6.67/10 (previous run: 5.00/10, +1.67)
Duplicate importing of functions - solved
import mycalc
print(mycalc.add(2, 3)) # 5
import mymath
print(mymath.add(2, 3)) # 6
import mycalc
print(mycalc.add(2, 3)) # 5
Script or library
- main
- name
We can have a file with all the functions implemented and then launch the run() function only if the file was executed as a stand-alone script.
def run():
print("run in ", __name__)
print("Name space in mymodule.py ", __name__)
if __name__ == '__main__':
run()
$ python mymodule.py
Name space in mymodule.py __main__
run in __main__
Script or library - import
If it is imported by another module then it won't run automatically. We have to call it manually.
import mymodule
print("Name space in import_mymodule.py ", __name__)
mymodule.run()
$ python import_mymodule.py
Name space in mymodule.py mymodule
Name space in import_mymodule.py __main__
run in mymodule
Script or library - from import
from mymodule import run
print("Name space in import_mymodule.py ", __name__)
run()
$ python import_from_mymodule.py
Name space in mymodule.py mymodule
Name space in import_mymodule.py __main__
run in mymodule
Scope of import
def div(a, b):
return a/b
from __future__ import print_function
from __future__ import division
import mydiv
print(mydiv.div(3, 2)) # 1
print(3/2) # 1.5
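The importing of functions, and the changes in the behavior of the compiler, are file-specific. In this case the change in the behavior of division is visible only in the division.py script, but not in the mydiv.py module. (The output shown is from Python 2. In Python 3 the / operator always performs true division, so both lines print 1.5.)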
Import multiple times
import one
import two
print("Hello")
import common
print("loading one")
import common
print("loading two")
print("import common")
import common
loading one
loading two
Hello
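The reason "import common" is printed only once is that Python keeps the already loaded modules in sys.modules and a later import of the same name just reuses the entry found there. A sketch of this using the json module of the standard library instead of the files above:

```python
import sys
import json            # first import: the code of the module runs, the result is cached
import json as again   # later import: taken from the cache, nothing runs again

print("json" in sys.modules)         # True
print(again is sys.modules["json"])  # True
print(again is json)                 # True
```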
Do not import *
- Despite the examples you can see in various places, I'd recommend never to import "everything" using *.
from one import *
from two import *
run()
- Where does run() come from?
- What if both modules have the run() function? Then the order of the imports will be important.
- What if one has the run() function, but a new version of two also adds one?
Exercise: Number guessing
Take the number guessing game and move the function out to a separate file and use it as a module.
Exercise: Scripts and modules
Take the number guessing game:
If I run it as a script like this: python game.py
then execute the whole game. Allow the user to play several games each time with a new hidden number.
If I load it as a module, then let me call the function that runs a single game with one hidden number. For example:
import game
game.run_game() # will generate a new hidden number
We should be able to even pass the hidden number as a parameter. Like this:
import game
game.run_game(42)
Exercise: Module my_sum
- Create a file called my_simple_math.py with two functions: div(a, b) and add(a, b), that will divide and add the two numbers respectively.
- Add another two functions called test_div and test_add that will test the above two functions using assert.
- Add code that will run the tests if someone executes python my_simple_math.py, running the file as if it were a script.
- Create another file called use_my_simple_math.py that will use the functions from the my_simple_math module to calculate 2 + 5 * 7.
- Make sure when you run python use_my_simple_math.py the tests won't run.
- Add documentation to the "add" and "div" functions with examples that can be used with doctest.
- Can you run the tests when the file is loaded as a module?
Exercise: Convert your script to module
- Take one of your real scripts (from work or from a previous assignment). Create a backup copy.
- Change the script so it can be imported as a module without automatically executing anything, while it still works when executed as a script.
- Add a new function to it called self_test and in that function add a few test cases to your code using 'assert'.
- Write another script that will load your real file as a module and will run the self_test.
- Let me know what the difficulties are!
Exercise: Add doctests to your own code
- Pick a module from your own code and create a backup copy. (from work)
- Add a function called 'self_test' that uses 'assert' to test some of the real functions of the module.
- Add code that will run the 'self_test' when the file is executed as a script.
- Add documentation to one of the functions and convert the 'assert'-based tests to doctests.
- Convert the mechanism that executed the 'self_test' to run the doctests as well.
- Let me know what the difficulties are!
Solution: Module my_sum
# my_simple_math.py
def div(a, b):
    '''
    >>> div(8, 2)
    4.0
    '''
    return a/b
def add(a, b):
    '''
    >>> add(2, 2)
    4
    '''
    return a * b # bug added on purpose!
def test_div():
    assert div(6, 3) == 2
    assert div(0, 10) == 0
    assert div(-2, 2) == -1
    #assert div(10, 0) == ??
def test_add():
    assert add(2, 2) == 4
    #assert add(1, 1) == 2
if __name__ == "__main__":
    test_div()
    test_add()
# use_my_simple_math.py
import my_simple_math
print(my_simple_math.add(2, 5 * 7)) # 2 + 5 * 7; prints 70 instead of 37 as long as the deliberate bug is in add()
print(dir(my_simple_math))
#my_simple_math.test_add()
Loaded modules and their path
import sys
for mod in sorted(sys.modules.keys()):
    try:
        print(mod, sys.modules[mod].__file__)
    except Exception as err:
        # some modules (e.g. the built-in ones) have no __file__ attribute
        print(mod)
Built-in modules
import sys
for mod in sys.builtin_module_names:
    print(mod)
assert to verify values
- assert
- raise
- Exception
def add(x, y):
    return x * y   # deliberate bug: multiplies instead of adding
for x, y, z in [(2, 2, 4), (9, 2, 11), (2, 3, 5)]:
    print(f"add({x}, {y}) == {z}")
    if add(x, y) != z:
        raise Exception(f"add({x}, {y}) != {z}")
        #raise AssertionError
add(2, 2) == 4
add(9, 2) == 11
Traceback (most recent call last):
File "examples/functions/raise_exception.py", line 7, in <module>
raise Exception(f"add({x}, {y}) != {z}")
Exception: add(9, 2) != 11
def add(x, y):
    return x * y   # deliberate bug: multiplies instead of adding
for x, y, z in [(2, 2, 4), (9, 2, 11), (2, 3, 5)]:
    print(f"add({x}, {y}) == {z}")
    assert add(x, y) == z
add(2, 2) == 4
add(9, 2) == 11
Traceback (most recent call last):
File "examples/functions/assert.py", line 6, in <module>
assert add(x, y) == z
AssertionError
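The bare AssertionError above does not say which values failed. As a small sketch (the helper name check is hypothetical and only used for this illustration), an assert statement can carry a message after a comma, which then appears in the exception. Keep in mind that assert statements are skipped entirely when Python runs with the -O flag.

```python
def add(x, y):
    return x * y   # the same deliberate bug as in the examples above


def check(x, y, expected):
    # The expression after the comma becomes the message of the AssertionError.
    assert add(x, y) == expected, f"add({x}, {y}) returned {add(x, y)}, expected {expected}"


check(2, 2, 4)   # passes by luck, because 2 * 2 == 2 + 2

try:
    check(9, 2, 11)
except AssertionError as err:
    message = str(err)
    print(message)   # add(9, 2) returned 18, expected 11
```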
mycalc as a self testing module
- file
import mycalc
print(mycalc.add(19, 23))
$ python use_mycalc.py
42
def test_add():
    print('Self testing {}'.format(__file__))
    assert add(1, 1) == 2
    assert add(-1, 1) == 0
    # assert add(-99, 1) == 0 # AssertionError
def add(a, b):
    return a + b
if __name__ == '__main__':
    test_add()
$ python mycalc.py
Self testing mycalc.py
doctest
- doctest
def fib(n):
    '''
    Before the tests

    >>> fib(3)
    2
    >>> fib(10)
    55
    >>> [fib(n) for n in range(11)]
    [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
    >>> fib(11)
    89

    After the tests
    '''
    values = [0, 1]
    if n == 11:
        return 'bug'
    while n > len(values) - 1:
        values.append(values[-1] + values[-2])
    return values[n]
#if __name__ == "__main__":
#    import doctest
#    doctest.testmod()
python -m doctest fibonacci_doctest.py
python examples/modules/fibonacci_doctest.py
**********************************************************************
File ".../examples/modules/fibonacci_doctest.py", line 12, in __main__.fib
Failed example:
fib(11)
Expected:
89
Got:
'bug'
**********************************************************************
1 items had failures:
1 of 4 in __main__.fib
***Test Failed*** 1 failures.
Export import
- __all__
- import
- from
- from mod import a,b,_c - import 'a', 'b', and '_c' from 'mod'
- from mod import * - import every name listed in __all__ of 'mod' if __all__ is available.
- from mod import * - import every name that does NOT start with _ (if __all__ is not available)
- import mod - import 'mod' and make every name in 'mod' accessible as 'mod.a', and 'mod._c'
def a():
    return "in a"
b = "value of b"
def _c():
    return "in _c"
def d():
    return "in d"
from my_module import a,b,_c
print(a()) # in a
print(b) # value of b
print(_c()) # in _c
print(d())
# Traceback (most recent call last):
# File ".../examples/modules/x.py", line 7, in <module>
# print(d())
# NameError: name 'd' is not defined
from my_module import *
print(a()) # in a
print(b) # value of b
print(d()) # in d
print(_c())
# Traceback (most recent call last):
# File ".../examples/modules/y.py", line 9, in <module>
# print(_c()) # in _c
# NameError: name '_c' is not defined
Export import with __all__
- __all__
__all__ = ['a', '_c']
def a():
    return "in a"
b = "value of b"
def _c():
    return "in _c"
def d():
    return "in d"
from my_module2 import *
print(a()) # in a
print(_c()) # in _c
print(b)
# Traceback (most recent call last):
# File ".../examples/modules/z.py", line 7, in <module>
# print(b) # value of b
# NameError: name 'b' is not defined
import module
import my_module
print(my_module.a()) # in a
print(my_module.b) # value of b
print(my_module._c()) # in _c
print(my_module.d()) # in d
deep copy list
a = [
{
'name': 'Joe',
'email': 'joe@examples.com',
},
{
'name': 'Mary',
'email': 'mary@examples.com',
},
]
b = a
a[0]['phone'] = '1234'
a[0]['name'] = 'Jane'
a.append({
'name': 'George'
})
print(a)
print(b)
[{'name': 'Jane', 'email': 'joe@examples.com', 'phone': '1234'}, {'name': 'Mary', 'email': 'mary@examples.com'}, {'name': 'George'}]
[{'name': 'Jane', 'email': 'joe@examples.com', 'phone': '1234'}, {'name': 'Mary', 'email': 'mary@examples.com'}, {'name': 'George'}]
a = [
{
'name': 'Joe',
'email': 'joe@examples.com',
},
{
'name': 'Mary',
'email': 'mary@examples.com',
},
]
b = a[:]
a[0]['phone'] = '1234'
a[0]['name'] = 'Jane'
a.append({
'name': 'George'
})
print(a)
print(b)
[{'name': 'Jane', 'email': 'joe@examples.com', 'phone': '1234'}, {'name': 'Mary', 'email': 'mary@examples.com'}, {'name': 'George'}]
[{'name': 'Jane', 'email': 'joe@examples.com', 'phone': '1234'}, {'name': 'Mary', 'email': 'mary@examples.com'}]
from copy import deepcopy
a = [
{
'name': 'Joe',
'email': 'joe@examples.com',
},
{
'name': 'Mary',
'email': 'mary@examples.com',
},
]
b = deepcopy(a)
a[0]['phone'] = '1234'
a[0]['name'] = 'Jane'
a.append({
'name': 'George'
})
print(a)
print(b)
[{'name': 'Jane', 'email': 'joe@examples.com', 'phone': '1234'}, {'name': 'Mary', 'email': 'mary@examples.com'}, {'name': 'George'}]
[{'name': 'Joe', 'email': 'joe@examples.com'}, {'name': 'Mary', 'email': 'mary@examples.com'}]
deep copy dictionary
a = {
'name': 'Foo Bar',
'grades': {
'math': 70,
'art' : 100,
},
'friends': ['Mary', 'John', 'Jane', 'George'],
}
b = a
a['grades']['math'] = 90
a['email'] = 'foo@bar.com'
print(a)
print(b)
{'name': 'Foo Bar', 'grades': {'math': 90, 'art': 100}, 'friends': ['Mary', 'John', 'Jane', 'George'], 'email': 'foo@bar.com'}
{'name': 'Foo Bar', 'grades': {'math': 90, 'art': 100}, 'friends': ['Mary', 'John', 'Jane', 'George'], 'email': 'foo@bar.com'}
- [deepcopy](https://docs.python.org/library/copy.html#copy.deepcopy)
from copy import deepcopy
a = {
'name': 'Foo Bar',
'grades': {
'math': 70,
'art' : 100,
},
'friends': ['Mary', 'John', 'Jane', 'George'],
}
b = deepcopy(a)
a['grades']['math'] = 90
a['email'] = 'foo@bar.com'
print(a)
print(b)
{'name': 'Foo Bar', 'grades': {'math': 90, 'art': 100}, 'friends': ['Mary', 'John', 'Jane', 'George'], 'email': 'foo@bar.com'}
{'name': 'Foo Bar', 'grades': {'math': 70, 'art': 100}, 'friends': ['Mary', 'John', 'Jane', 'George']}
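The list examples showed a middle ground between plain assignment and deepcopy: the slice copy. For dictionaries the equivalent is a shallow copy. The sketch below uses copy.copy (the same result as calling a.copy()) on a shortened version of the dictionary above: the outer dictionary is new, but the nested 'grades' dictionary is still shared.

```python
from copy import copy, deepcopy

a = {
    'name': 'Foo Bar',
    'grades': {'math': 70, 'art': 100},
}

shallow = copy(a)     # new outer dictionary, the nested objects are shared
deep = deepcopy(a)    # fully independent copy

a['grades']['math'] = 90      # change inside the nested dictionary
a['email'] = 'foo@bar.com'    # new top-level key

print(shallow)  # {'name': 'Foo Bar', 'grades': {'math': 90, 'art': 100}}
print(deep)     # {'name': 'Foo Bar', 'grades': {'math': 70, 'art': 100}}
```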
Python standard modules (standard packages)
Some Standard packages
- sys - (Python) System specific
- os - Operating System
- stat - Inode table
- shutil - File Operations
- glob - Unix style pathname expansion
- subprocess - Processes
- argparse - Command Line Arguments
- re - Regexes
- math - Mathematics
- time - Timestamp and friends
- datetime - Time management
- random - Random numbers
math
math examples
import math
print(math.pi) # 3.141592653589793
print(math.e) # 2.718281828459045
print(math.sin(23)) # -0.8462204041751706
print(math.perm(3)) # 6 permutations
print(math.perm(4)) # 24 permutations
print(math.perm(4, 2)) # 12 permutations
print(math.lcm(120, 42)) # 840 least common multiple
print(math.gcd(120, 42)) # 6 greatest common divisor
sys
sys module
- sys
- argv
- executable
- path
- version_info
import sys
print(sys.argv) # the list of the values on the command line
# sys.argv[0] is the name of the Python script
print(sys.executable) # path to the python interpreter
# print(sys.path)
# list of file-system path strings for searching for modules
# hard-coded at compile time but can be changed via the PYTHONPATH
# environment variable or during execution by modifying sys.path
print(sys.version_info)
# sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0)
# sys.version_info(major=3, minor=8, micro=2, releaselevel='final', serial=0)
print(sys.version_info.major) # 2 or 3
print(sys.platform) # darwin or linux or win32
['examples/sys/mysys.py']
/home/gabor/venv3/bin/python
sys.version_info(major=3, minor=9, micro=7, releaselevel='final', serial=0)
3
linux
Later we'll also see the platform module for more details of the Operating System.
Writing to standard error (stderr)
- stdout
- stderr
- write
import sys
print("on stdout (Standard Output)")
print("on stderr (Standard Error)", file=sys.stderr)
sys.stderr.write("on stderr using write\n")
# x = 0
# print(1/x)
Redirection (Works on Linux/Mac/Windows):
python stderr.py > out.txt 2> err.txt
python stderr.py > all.txt 2>&1
python stderr.py 2> /dev/null # On Linux and OSX
python stderr.py 2> nul # On Windows
exit prints to STDERR
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} NUMBER")
print(f"you sent in {sys.argv[1]}")
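To see where the message really goes, here is a small sketch that runs a one-line child program and captures its two output channels separately. The child uses sys.exit with a string, which behaves like the exit call above: the text is written to STDERR and the exit code is 1. The child program and its usage message are made up for this illustration.

```python
import subprocess
import sys

# A one-line child program that calls sys.exit() with a string.
child_code = 'import sys; sys.exit("Usage: prog NUMBER")'

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,   # collect stdout and stderr separately
    text=True,             # decode the output to str
)

print(repr(result.stdout))   # nothing arrived on STDOUT
print(repr(result.stderr))   # the usage message arrived on STDERR
print(result.returncode)     # a string passed to exit means exit code 1
```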
os
python which OS are we running on (os, platform)
import os
import platform
print("Name: ", os.name)
print("System: ", platform.system())
print("Release: ", platform.release())
print("Architecture:", platform.architecture())
print("Machine: ", platform.machine())
print("Processor: ", platform.processor())
print("Release: ", platform.release())
print("Version: ", platform.version())
# On Windows:
# nt
# Windows
# 10
if platform.system() != 'Windows':
    print("Uname: ", os.uname())
    # On Windows uname is not available
- Linux
Name: posix
System: Linux
Release: 5.13.0-37-generic
Architecture: ('64bit', 'ELF')
Machine: x86_64
Processor: x86_64
Release: 5.13.0-37-generic
Version: #42-Ubuntu SMP Tue Mar 15 14:34:06 UTC 2022
Uname: posix.uname_result(sysname='Linux', nodename='code-maven', release='5.13.0-37-generic', version='#42-Ubuntu SMP Tue Mar 15 14:34:06 UTC 2022', machine='x86_64')
- MacOSX
Name: posix
System: Darwin
Release: 20.6.0
Architecture: ('64bit', '')
Machine: x86_64
Processor: i386
Release: 20.6.0
Version: Darwin Kernel Version 20.6.0: Mon Aug 30 06:12:21 PDT 2021; root:xnu-7195.141.6~3/RELEASE_X86_64
Uname: posix.uname_result(sysname='Darwin', nodename='FooBar', release='20.6.0',
version='Darwin Kernel Version 20.6.0: Mon Aug 30 06:12:21 PDT 2021;
root:xnu-7195.141.6~3/RELEASE_X86_64', machine='x86_64')
Current directory (getcwd, pwd, chdir)
- getcwd
- pwd
- chdir
import sys
import os
to_dir = '..'
# to_dir = '/path/to/some/dir'
if len(sys.argv) == 2:
    to_dir = sys.argv[1]
current_dir = os.getcwd()
print(current_dir)
os.chdir(to_dir)
new_dir = os.getcwd()
print(new_dir)
Linux, OSX:
$ pwd
Windows: (cd without parameters prints the current working directory)
> cd
OS path
- path
- abspath
- exists
- basename
- dirname
import sys
import os
path_to_thing = __file__
if len(sys.argv) == 2:
    path_to_thing = sys.argv[1]
print(path_to_thing)
print( os.path.basename(path_to_thing) )
print( os.path.dirname(path_to_thing) )
print( os.path.abspath(path_to_thing) )
print( os.path.exists(path_to_thing) )
print( os.path.isdir(path_to_thing) )
print( os.path.isfile(path_to_thing) )
os.path.join
- os.path.join
- join
import os
dirname = 'home'
subdirname = 'foo'
filename = 'work.txt'
path = f"{dirname}\\{subdirname}\\{filename}"
print(path) # home\foo\work.txt
path = os.path.join(dirname, subdirname, filename)
print(path)
# Linux, OSX: home/foo/work.txt
# Windows: home\foo\work.txt
Directory listing
- dir
- listdir
- path
- os.listdir
import os
import sys
if len(sys.argv) != 2:
    exit("Usage: {} directory".format(sys.argv[0]))
path = sys.argv[1]
things = os.listdir(path)
for name in things:
    print(name)
    print(os.path.join(path, name))
Directory listing using glob
- glob
- glob.glob
import sys
import glob
if len(sys.argv) == 2:
    exp = sys.argv[1]
    print(exp)
    items = glob.glob(exp)
    print(items)
else:
    files = glob.glob("?[abcdef]*.py")
    print(files)
    files = glob.glob("/usr/bin/*.sh")
    print(files)
Traverse directory tree - list directories recursively
- walk
- os.walk
import os
import sys
if len(sys.argv) != 2:
    exit("Usage: {} PATH_TO_DIRECTORY".format(sys.argv[0]))
root = sys.argv[1]
for dirname, dirs, files in os.walk(root):
    #print(dirname) # relative path (from cwd) to the directory being processed
    #print(dirs) # list of subdirectories in the currently processed directory
    #print(files) # list of files in the currently processed directory
    for filename in files:
        print(os.path.join(dirname, filename)) # relative path to the "current" file
OS dir (mkdir, makedirs, remove, rmdir)
- mkdir
- makedirs
- remove
- unlink
- rmdir
- removedirs
- rmtree
- shutil
- mkdir is like mkdir in Linux and Windows
- makedirs is like mkdir -p in Linux
- remove and unlink are like rm -f in Linux or del in Windows
- rmdir is like rmdir
import os
import shutil
# create a single directory
path_to_new_dir = 'abc'
os.mkdir(path_to_new_dir)
# create also the parent directories, if needed
path_to_new_dir = 'dir/subdir/subdir'
# os.mkdir(path_to_new_dir) # will fail if 'dir' or 'dir/subdir' does not exist
os.makedirs(path_to_new_dir)
# remove a file (both)
os.remove(path_to_file)
os.unlink(path_to_file)
# remove single empty directory
os.rmdir(path_to_dir)
# remove directory tree if there are no files in them
os.removedirs(path_to_dir)
# Remove a whole directory structure (subdirs and files)
# Like rm -rf
shutil.rmtree(path_to_dir)
expanduser - handle tilde ~ the home directory of the user
- expanduser
- ~
- os.path.expanduser
import os
# The home directory of the current user
home_directory = os.path.expanduser("~")
print(home_directory)
# /home/gabor
# 'C:\\Users\\Gabor Szabo'
Get process ID
- getpid
- getppid
- Works on all 3 Operating systems
import os
print(os.getpid())
print(os.getppid())
93518
92859
This is on Linux/OSX
echo $$
External command with system
- os.system
- system
import os
command = 'ls -l'
exit_code = os.system(command)
# $? on Linux/OSX
# %ERRORLEVEL% on Windows
print(exit_code)
exit_code = os.system('ls qqrq')
print(exit_code)
print(exit_code // 256)
exit_code = os.system('ls /root')
print(exit_code)
print(exit_code // 256)
If you wanted to list the content of a directory in an OS-independent way you'd use os.listdir('.') or you could use the glob.glob("*.py") function to get a subset of the files.
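A minimal sketch of the two OS-independent alternatives mentioned above. It creates a temporary directory with three made-up file names, so it can be run anywhere without depending on the ls command.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create a few empty files to list (the names are arbitrary).
    for name in ("a.py", "b.py", "notes.txt"):
        with open(os.path.join(folder, name), "w"):
            pass

    # Everything in the directory
    everything = sorted(os.listdir(folder))

    # Only the files matching a wildcard
    python_files = sorted(
        os.path.basename(path)
        for path in glob.glob(os.path.join(folder, "*.py"))
    )

print(everything)     # ['a.py', 'b.py', 'notes.txt']
print(python_files)   # ['a.py', 'b.py']
```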
Accessing the system environment variables from Python
- os.environ
import os
print(os.environ['HOME']) # /Users/gabor
print(os.environ.get('HOME')) # /Users/gabor
for k in os.environ.keys():
    print("{:30} {}".format(k , os.environ[k]))
os.environ is a dictionary-like object where the keys are the names of the environment variables and the values are, well, the values.
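Because os.environ behaves like a dictionary, a missing variable raises KeyError with the square-bracket syntax, while get and os.getenv return None or a default value. A short sketch, assuming a variable name that is not set on your system (the name is made up for this example):

```python
import os

# Hypothetical variable name that is very unlikely to be set.
name = "NO_SUCH_VARIABLE_FOR_DEMO"
os.environ.pop(name, None)             # make sure it really is missing

print(os.environ.get(name))            # None
print(os.environ.get(name, "guest"))   # guest
print(os.getenv(name, "guest"))        # guest
print(name in os.environ)              # False

try:
    os.environ[name]
except KeyError as err:
    missing = err.args[0]
    print(f"KeyError: {missing}")
```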
Set environment variables on the fly
import os
print(os.environ.get('MYNAME'))
print(os.getenv('MYNAME'))
- On Linux and macOS:
MYNAME=Foo python examples/os/show_env.py
Reading the .env environment file in Python
The .env file is in the same folder where the program is.
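import os
from dotenv import load_dotenv # provided by the python-dotenv package
load_dotenv() # load the variables of the .env file into os.environ
print(os.environ.get('MYNAME'))
print(os.getenv('MYNAME'))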
pip install python-dotenv
python examples/os/read_env.py
SOME_THING=other python examples/os/read_env.py
Set env and run command
import os
os.system("echo hello")
os.system("echo $HOME")
os.system("echo Before $MY_TEST")
os.environ['MY_TEST'] = 'qqrq'
os.system("echo After $MY_TEST")
We can change the environment variables and that change will be visible in subprocesses, but once we exit from our Python program, the change will not persist.
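The same behavior can be checked without relying on a shell. The sketch below starts a child Python process and asks it to print the variable. It also shows one possible way (an assumption about what you might want, not something the example above does) to give a single child a different value through the env parameter without touching the environment of the parent.

```python
import os
import subprocess
import sys

os.environ["MY_TEST"] = "qqrq"

# The child only prints what it sees in its own environment.
child_code = 'import os; print(os.environ.get("MY_TEST"))'

result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
)
seen_by_child = result.stdout.strip()
print(seen_by_child)    # qqrq

# Pass a modified copy of the environment to one child only.
custom_env = dict(os.environ, MY_TEST="other")
result = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    env=custom_env,
)
seen_with_custom_env = result.stdout.strip()
print(seen_with_custom_env)     # other
print(os.environ["MY_TEST"])    # qqrq (the parent was not changed)
```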
Pathlib
Pathlib example
- Path
from pathlib import Path
file = Path("python.json")
print(file)
print(file.__class__.__name__) # PosixPath
Pathlib cwd
- cwd
from pathlib import Path
cwd = Path.cwd()
print(cwd)
print(cwd.__class__.__name__) # PosixPath
Pathlib get extension (suffix)
- suffix
from pathlib import Path
file = Path("path/to/code.py")
print(file.suffix) # .py
print(file.suffix.__class__.__name__) # str
file = Path("path/to/code.yaml")
print(file.suffix) # .yaml
file = Path("path/to/.bashrc")
print(file.suffix) # (empty string)
folder = Path("path/to")
print(folder.suffix) # (empty string)
Pathlib current file
- file
from pathlib import Path
this = Path(__file__)
print(this)
Pathlib parents (dirname)
- parent
- parents
- dirname
from pathlib import Path
this = Path(__file__)
print(this)
print(this.parent) # dirname
print(this.parents[0]) # dirname (first parent)
print(this.parents[1]) # grandparent
...
print(this.parents[-1]) # / (negative indexes need Python 3.10 or newer)
Pathlib parts (basename)
- basename
- parts
from pathlib import Path
this = Path(__file__)
print(this)
print(this.parts[-1]) # (basename)
print(this.parts[0]) # /
print(this.parts) # (each part of the path)
Pathlib exists
- exists
from pathlib import Path
file = Path(__file__)
print(file.exists()) # True
file = Path("hello.txt")
print(file.exists()) # False
folder = Path(".")
print(folder.exists()) # True
Pathlib iterdir (flat)
- iterdir
- Iterate over the things (file names, folder names, etc.) in a folder.
from pathlib import Path
folder = Path(".")
for item in folder.iterdir():
    print(item)
Pathlib mkdir (makedir)
- mkdir
- makedir
from pathlib import Path
folder = Path("abc")
folder.mkdir() # Creates "abc", fails if it already exists
Path("something").joinpath("else").mkdir(parents=True, exist_ok=True)
# parents - create intermediate folders as well
# exist_ok - don't fail if folder already exists
from pathlib import Path
folder = Path("/")
print(folder) # /
subfolder = folder.joinpath("etc")
print(subfolder) # /etc
file1 = subfolder.joinpath("a.txt")
print(file1) # /etc/a.txt
file2 = subfolder / "b.txt"
print(file2) # /etc/b.txt
shutil
shutil module
- shutil
- cp
- copy
- copytree
- move
- rmtree
- shutil - File Operations
import shutil
shutil.copy(source, dest)
shutil.copytree(source, dest)
shutil.move(source, dest)
shutil.rmtree(path)
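The calls above use source, dest and path as placeholders. Here is a runnable sketch of the same four functions that works inside a temporary directory, so nothing on your disk is harmed. The file and folder names (src, a.txt, and so on) are made up for this illustration.

```python
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source_dir = root / "src"
    source_dir.mkdir()
    (source_dir / "a.txt").write_text("hello")

    shutil.copy(source_dir / "a.txt", root / "a_copy.txt")           # copy a single file
    shutil.copytree(source_dir, root / "src_copy")                   # copy a whole tree
    shutil.move(str(root / "a_copy.txt"), str(root / "moved.txt"))   # move (rename) a file
    shutil.rmtree(source_dir)                                        # remove a whole tree

    # What is left in the temporary directory?
    remaining = sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))

print(remaining)   # ['moved.txt', 'src_copy', 'src_copy/a.txt']
```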
time
time module
- time
- timezone
- daylight
- gmtime
- strftime
import time
now = time.time()
print(now) # 1351178170.85
print(type(now)) # <class 'float'>
print(time.timezone) # -7200 = 2*60*60 (GMT + 2)
print(time.daylight) # 1 (DST or Daylight Saving Time)
print(time.gmtime()) # time.struct_time
# time.struct_time(tm_year=2012, tm_mon=10, tm_mday=25,
# tm_hour=17, tm_min=25, tm_sec=34, tm_wday=3, tm_yday=299, tm_isdst=0)
ts = time.gmtime()
print(ts.tm_year) # 2012
print(time.strftime('%Y-%m-%d %H:%M:%S')) # 2012-10-25 17:16:10
timestamp = 1051178170
print(time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(timestamp))) # 2003-04-24 12:56:10
print(time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime(0))) # 1970-01-01 00:00:00
sleep in Python
- sleep
import time
start = time.time()
print("hello " + str(start))
time.sleep(3.5)
end = time.time()
print("world " + str(end))
print("Elapsed time:" + str(end-start))
hello 1475217162.472256
world 1475217165.973437
Elapsed time:3.501181125640869
timer
More time-related examples.
import random
import time
# https://docs.python.org/3/library/time.html#time.struct_time
print(time.time()) # time since the epoch in seconds
print(time.asctime()) # current local time in human-readable format
print(time.strftime("%Y-%m-%d %H:%M:%S")) # create your own human-readable format
print(time.gmtime(0)) # epoch
print(time.asctime(time.gmtime(0))) # epoch in human-readable format
print(time.localtime()) # local time now
print(time.gmtime()) # time in London
print(time.process_time())
print(time.process_time_ns())
s = time.perf_counter()
ps = time.process_time()
print(time.monotonic())
time.sleep(0.1)
print(time.monotonic())
e = time.perf_counter()
for _ in range(100000):
random.random()
pe = time.process_time()
print(s)
print(e)
print(e-s)
print(pe-ps)
# print(time.get_clock_info('monotonic'))
Current date and time datetime now
- datetime
- now
- strftime
import datetime
now = datetime.datetime.now()
print(now) # 2015-07-02 16:28:01.762244
print(type(now)) # <class 'datetime.datetime'>
print(now.year) # 2015
print(now.month) # 7
print(now.day) # 2
print(now.hour) # 16
print(now.minute) # 28
print(now.second) # 1
print(now.microsecond) # 762244
print(now.strftime("%Y%m%d-%H%M%S-%f")) # 20150702-162801-762244
print(now.strftime("%B %b %a %A")) # July Jul Thu Thursday
print(now.strftime("%c")) # Thu Jul 2 16:28:01 2015
Converting string to datetime (parsing date and time string)
- strptime
import datetime
date = "2012-12-19"
some_day = datetime.datetime.strptime(date, '%Y-%m-%d') # YYYY-MM-DD
print(type(some_day)) # <class 'datetime.datetime'>
print(some_day) # 2012-12-19 00:00:00
timestamp = "2013-11-04 11:23:45" # YYYY-MM-DD HH:MM:SS
some_time = datetime.datetime.strptime(timestamp, '%Y-%m-%d %H:%M:%S')
print(type(some_time)) # <class 'datetime.datetime'>
print(some_time) # 2013-11-04 11:23:45
print(some_time.minute) # 23
# Make sure you know how the date was formatted!
date = "12/3/2012"
dt = datetime.datetime.strptime(date, '%m/%d/%Y') # MM/DD/YYYY date format in USA
print(dt) # 2012-12-03 00:00:00
dt = datetime.datetime.strptime(date, '%d/%m/%Y') # DD/MM/YYYY date format elsewhere
print(dt) # 2012-03-12 00:00:00
Parse datetime string with and without timezone information
- strptime
- tzinfo
import datetime
dt = datetime.datetime.strptime('Jun 7, 2022', '%b %d, %Y')
print(dt)
print(dt.tzinfo)
dt_utc = datetime.datetime.strptime('Jun 7, 2022 +0000', '%b %d, %Y %z')
print(dt_utc)
print(dt_utc.tzinfo)
datetime fromisoformat
- fromisoformat
import datetime
dt = datetime.datetime.fromisoformat('2000-01-01')
print(dt) # 2000-01-01 00:00:00
date fromisoformat (only date, no time)
- date
- fromisoformat
import datetime
date = datetime.date.fromisoformat('2000-01-01')
print(date) # 2000-01-01
datetime arithmetic (subtract)
- timedelta
- total_seconds
- strptime
import datetime
t1 = "2013-12-29T11:23:45"
t2 = "2014-01-02T10:19:49"
dt1 = datetime.datetime.strptime(t1, '%Y-%m-%dT%H:%M:%S')
dt2 = datetime.datetime.strptime(t2, '%Y-%m-%dT%H:%M:%S')
print(dt1) # 2013-12-29 11:23:45
print(dt2) # 2014-01-02 10:19:49
diff = dt2-dt1
print(diff) # 3 days, 22:56:04
print(type(diff)) # <class 'datetime.timedelta'>
print(diff.total_seconds()) # 341764.0
time_travel = dt1-dt2
print(time_travel) # -4 days, 1:03:56
print(time_travel.total_seconds()) # -341764.0
# d = dt1+dt2
# TypeError: unsupported operand type(s) for +: 'datetime.datetime' and 'datetime.datetime'
Timezone aware datetime
- tzinfo
import datetime
ts = "2022-12-20T11:23:45"
# Naive datetime object:
dt = datetime.datetime.strptime(ts, '%Y-%m-%dT%H:%M:%S')
now = datetime.datetime.now()
print(now) # 2022-12-25 22:39:39.093285
print(dt.tzinfo) # None
print(now.tzinfo) # None
elapsed = now-dt
print(elapsed) # 5 days, 11:15:54.093285
print(elapsed.total_seconds()) # 472554.093285
print()
# (Timezone) aware datetime object:
dt_utc = datetime.datetime.strptime(f'{ts}+0000', '%Y-%m-%dT%H:%M:%S%z')
now_utc = datetime.datetime.now(datetime.timezone.utc)
print(now_utc) # 2022-12-25 21:39:39.093880+00:00
print(dt_utc.tzinfo) # UTC
print(now_utc.tzinfo) # UTC
elapsed_utc = now_utc-dt_utc
print(elapsed_utc) # 5 days, 10:15:54.093880
print(elapsed_utc.total_seconds()) # 468954.09388
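The two kinds of datetime objects cannot be mixed in arithmetic. A short sketch of what happens if you try, and one possible fix that is valid only under the assumption that the naive value was meant to be in UTC:

```python
import datetime

naive = datetime.datetime(2022, 12, 20, 11, 23, 45)
aware = datetime.datetime(2022, 12, 20, 11, 23, 45, tzinfo=datetime.timezone.utc)

try:
    aware - naive
except TypeError as err:
    # Python refuses to subtract a naive datetime from an aware one.
    error_message = str(err)
    print(error_message)

# Assumption: the naive value was recorded in UTC, so we can label it as such.
as_utc = naive.replace(tzinfo=datetime.timezone.utc)
difference = aware - as_utc
print(difference)   # 0:00:00
```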
datetime arithmetic (compare, sort)
import datetime
t1 = "2013-12-29T11:23:45"
t2 = "2014-01-02T10:19:49"
dt1 = datetime.datetime.strptime(t1, '%Y-%m-%dT%H:%M:%S')
dt2 = datetime.datetime.strptime(t2, '%Y-%m-%dT%H:%M:%S')
dt3 = datetime.datetime.strptime(t2, '%Y-%m-%dT%H:%M:%S')
print(dt1) # 2013-12-29 11:23:45
print(dt2) # 2014-01-02 10:19:49
print(dt2 > dt1) # True
print(dt1 > dt2) # False
print(dt2 == dt3) # True
print(dt2 == dt1) # False
dates = [dt2, dt1, dt3]
print(dates)
# [datetime.datetime(2014, 1, 2, 10, 19, 49), datetime.datetime(2013, 12, 29, 11, 23, 45), datetime.datetime(2014, 1, 2, 10, 19, 49)]
print(sorted(dates))
# [datetime.datetime(2013, 12, 29, 11, 23, 45), datetime.datetime(2014, 1, 2, 10, 19, 49), datetime.datetime(2014, 1, 2, 10, 19, 49)]
datetime arithmetic (addition)
import datetime
timestamp = "2013-12-29T11:23:45"
ts = datetime.datetime.strptime(timestamp, '%Y-%m-%dT%H:%M:%S')
print(type(ts))
diff = datetime.timedelta(days = 3)
print(diff)
nts = ts + diff
print(type(nts))
print(ts)
print(nts) # 2014-01-01 11:23:45
Rounding datetime object down to the second (removing microseconds)
- microseconds
- microsecond
import datetime
# Old solution
now = datetime.datetime.now()
rounded = now - datetime.timedelta(microseconds=now.microsecond)
print(now) # 2019-11-01 07:11:19.930974
print(rounded) # 2019-11-01 07:11:19
# A simpler solution
ts = datetime.datetime.now().replace(microsecond=0)
print(ts) # 2019-11-01 07:11:20
Rounding datetime object to date (removing hours, minutes, seconds)
- microsecond
- second
- minute
- hour
import datetime
ts = datetime.datetime.now().replace(
microsecond=0,
second=0,
minute=0,
hour=0,
)
print(ts) # 2023-01-19 00:00:00
Convert datetime object to date object
- datetime
- now
- date
import datetime
ts = datetime.datetime.now().date()
print(ts) # 2023-01-19
Convert datetime object to time object
- time
import datetime
ts = datetime.datetime.now().time()
print(ts) # 09:07:02.846346
Today (date)
- date
- today
import datetime
now = datetime.datetime.now()
print(now.date())
print(type(now.date()))
today = datetime.date.today()
print(today)
print(type(today))
2023-04-17
<class 'datetime.date'>
2023-04-17
<class 'datetime.date'>
subprocess
External CLI tool to demo subprocess
- subprocess
- call
- execute
The process.py is a simple script we are going to use to demonstrate how an external program can be executed from within Python. It is a Python program, but you could do the exact same thing with any command-line application written in any language. We use this Python script as an example because we know you already have Python on your computer.
The external command:
import time
import sys
import os
if len(sys.argv) != 3:
    exit(f"{sys.argv[0]} SECONDS EXIT_CODE")
print(f"process ID: {os.getpid()} parent ID: {os.getppid()}")
seconds = int(sys.argv[1])
exit_code = int(sys.argv[2])
for sec in range(seconds):
    print("OUT {}".format(sec), flush=True)
    print("ERR {}".format(sec), file=sys.stderr)
    time.sleep(1)
exit(exit_code)
Try it on the command line: python process.py 3 7
Run with os.system
import os
import sys
exit_code = os.system(f"python process.py 5 2")
print(f'exit code: {exit_code // 256}')
Output:
OUT 0
ERR 0
OUT 1
ERR 1
OUT 2
ERR 2
OUT 3
ERR 3
OUT 4
ERR 4
exit code: 2
Run external process let STDOUT and STDERR through
import subprocess
import time
import os
import psutil
def run_process(command):
    print(f"Before Popen {os.getpid()}")
    proc = subprocess.Popen(command) # This starts running the external process
    print(f"After Popen of {proc.pid}")
    psproc = psutil.Process(proc.pid)
    print(f"name: {psproc.name()}")
    print(f"cmdline: {psproc.cmdline()}")
    time.sleep(1.5)
    print("Before communicate")
    proc.communicate()
    print("After communicate")
    exit_code = proc.returncode
    return exit_code
print("Before run_process", flush=True)
exit_code = run_process(['python', 'process.py', '5', '0'])
print("After run_process", flush=True)
print(f'exit code: {exit_code}', flush=True)
Output:
Before run_process
Before Popen
After Popen
OUT 0
ERR 0
OUT 1
ERR 1
Before communicate
OUT 2
ERR 2
OUT 3
ERR 3
OUT 4
ERR 4
After communicate
After run_process
exit code: 0
Run external process and capture STDOUT and STDERR separately
import subprocess
import time
def run_process(command):
    print("Before Popen")
    proc = subprocess.Popen(command,
        stdout = subprocess.PIPE,
        stderr = subprocess.PIPE,
    ) # This starts running the external process
    print("After Popen")
    time.sleep(1.5)
    print("Before communicate")
    out, err = proc.communicate()
    print("After communicate")
    # out and err are two bytes objects
    exit_code = proc.returncode
    return exit_code, out, err
print("Before run_process")
exit_code, out, err = run_process(['python', 'process.py', '5', '0'])
print("After run_process")
print("")
print(f'exit code: {exit_code}')
print("")
print('out:')
for line in out.decode('utf8').split('\n'):
    print(line)
print('err:')
for line in err.decode('utf8').split('\n'):
    print(line)
Output:
Before run_process
Before Popen
After Popen
Before communicate
After communicate
After run_process
exit code: 0
out:
OUT 0
OUT 1
OUT 2
OUT 3
OUT 4
err:
ERR 0
ERR 1
ERR 2
ERR 3
ERR 4
Run external process and capture STDOUT and STDERR merged together
import subprocess
import time
def run_process(command):
    print("Before Popen")
    proc = subprocess.Popen(command,
        stdout = subprocess.PIPE,
        stderr = subprocess.STDOUT,
    ) # This starts running the external process
    print("After Popen")
    time.sleep(1.5)
    print("Before communicate")
    out, err = proc.communicate()
    print("After communicate")
    # out holds the merged output as bytes, err is None
    exit_code = proc.returncode
    return exit_code, out, err
print("Before run_process")
exit_code, out, err = run_process(['python', 'process.py', '5', '0'])
print("After run_process")
print("")
print(f'exit code: {exit_code}')
print("")
print('out:')
for line in out.decode('utf8').split('\n'):
    print(line)
print('err:')
print(err)
Output:
Before run_process
Before Popen
After Popen
Before communicate
After communicate
After run_process
exit code: 0
out:
OUT 0
ERR 0
OUT 1
ERR 1
OUT 2
ERR 2
OUT 3
ERR 3
OUT 4
ERR 4
err:
None
In this case err will always be None.
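When there is nothing to do between starting the process and collecting its output, the higher-level subprocess.run function can replace the Popen and communicate pair. The following is a small sketch that does not use process.py. Instead it runs a made-up two-line child program, so it is self-contained. Passing text=True makes the output arrive as str instead of bytes.

```python
import subprocess
import sys

# A tiny child program that writes one line to each channel.
child_code = (
    'import sys\n'
    'print("OUT 0", flush=True)\n'
    'print("ERR 0", file=sys.stderr, flush=True)\n'
)

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,   # send stderr into the same pipe as stdout
    text=True,
)

lines = result.stdout.splitlines()
print(lines)              # ['OUT 0', 'ERR 0']
print(result.stderr)      # None, because stderr was merged into stdout
print(result.returncode)  # 0
```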
subprocess in the background
In the previous examples we ran the external command and then waited till it finishes before doing anything else.
In some cases you might prefer to do something else while you are waiting - effectively running the process in the background. This also makes it easy to enforce a time-limit on the process. If it does not finish within a given amount of time (timeout) we raise an exception.
In this example we still collect the standard output and the standard error at the end of the process.
import subprocess
import sys
import time
def run_process(command, timeout):
    print("Before Popen")
    proc = subprocess.Popen(command,
        stdout = subprocess.PIPE,
        stderr = subprocess.PIPE,
    )
    print("After Popen")
    while True:
        poll = proc.poll() # returns the exit code or None if the process is still running
        print(f"poll: {poll}")
        time.sleep(0.5) # here we could actually do something useful
        timeout -= 0.5
        if timeout <= 0:
            break
        if poll is not None:
            break
    print(f"Final: {poll}")
    if poll is None:
        raise Exception("Timeout")
    exit_code = proc.returncode
    out, err = proc.communicate()
    return exit_code, out, err
exit_code, out, err = run_process([sys.executable, 'process.py', '3', '0'], 6)
print("-----")
print(f"exit_code: {exit_code}")
print("OUT")
print(out.decode())
print("ERR")
print(err.decode())
Output:
Before Popen
After Popen
poll: None
poll: None
poll: None
poll: None
poll: None
poll: None
poll: None
poll: 0
Final: 0
-----
exit_code: 0
OUT
OUT 0
OUT 1
OUT 2
ERR
ERR 0
ERR 1
ERR 2
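If the only goal is the time limit and there is no other work to do while waiting, the polling loop can be replaced by the timeout parameter of communicate, which raises subprocess.TimeoutExpired. This is a sketch of that alternative, not the approach used in the example above. The function name run_with_timeout and the two child commands are made up for this illustration. Unlike the example above, it also kills the child when the time is up.

```python
import subprocess
import sys


def run_with_timeout(command, timeout):
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()          # stop the child that is still running
        proc.communicate()   # wait for it and close the pipes
        raise
    return proc.returncode, out, err


fast = [sys.executable, "-c", "print('done')"]
slow = [sys.executable, "-c", "import time; time.sleep(30)"]

exit_code, out, err = run_with_timeout(fast, timeout=10)
print(exit_code, out.decode().strip())   # 0 done

try:
    run_with_timeout(slow, timeout=0.5)
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True
print(timed_out)   # True
```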
subprocess collect output while external program is running
For this to work properly the external program might need to set the output to unbuffered.
In recent versions of Python printing to STDERR is line-buffered by default, so every line is sent out immediately, but we had to pass flush=True to the print function to get the same behavior for STDOUT as well.
import subprocess
import sys
import time
def run_process(command, timeout):
    print("Before Popen")
    proc = subprocess.Popen(command,
        stdout = subprocess.PIPE,
        stderr = subprocess.PIPE,
        universal_newlines=True,
        bufsize=0,
    )
    print("After Popen")
    out = ""
    err = ""
    while True:
        exit_code = proc.poll()
        print(f"poll: {exit_code} {time.time()}")
        this_out = proc.stdout.readline()
        this_err = proc.stderr.readline()
        print(f"out: {this_out}", end="")
        print(f"err: {this_err}", end="")
        out += this_out
        err += this_err
        time.sleep(0.5) # here we could actually do something useful
        timeout -= 0.5
        if timeout <= 0:
            break
        if exit_code is not None:
            break
    print(f"Final: {exit_code}")
    if exit_code is None:
        raise Exception("Timeout")
    return exit_code, out, err
exit_code, out, err = run_process([sys.executable, 'process.py', '4', '3'], 20)
#exit_code, out, err = run_process(['docker-compose', 'up', '-d'], 20)
print("-----")
print(f"exit_code: {exit_code}")
print("OUT")
print(out)
print("ERR")
print(err)
Output:
Before Popen
After Popen
poll: None 1637589106.083494
out: OUT 0
err: ERR 0
poll: None 1637589106.6035957
out: OUT 1
err: ERR 1
poll: None 1637589107.6047328
out: OUT 2
err: ERR 2
poll: None 1637589108.6051855
out: OUT 3
err: ERR 3
poll: None 1637589109.6066446
out: err: poll: 0 1637589110.6227856
out: err: Final: 0
-----
exit_code: 0
OUT
ERR
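The example above relies on the child flushing its own output. When the child is itself a Python program, another option is to start it with the -u command line switch of the interpreter, which makes its standard streams unbuffered. The following is a minimal sketch under stated assumptions: the inline child_code program is a hypothetical stand-in for process.py, and STDERR is merged into STDOUT so a single loop can read everything without blocking on an idle pipe.

```python
import subprocess
import sys

# Hypothetical child program used only for this sketch; it never calls flush
child_code = (
    "import time\n"
    "for i in range(3):\n"
    "    print(f'OUT {i}')\n"
    "    time.sleep(0.1)\n"
)

proc = subprocess.Popen(
    [sys.executable, '-u', '-c', child_code],  # -u: unbuffered output in the child
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,                  # merge STDERR into STDOUT
    text=True,
)

lines = []
for line in proc.stdout:       # yields each line as soon as the child writes it
    print(f"got: {line}", end="")
    lines.append(line.rstrip("\n"))

exit_code = proc.wait()        # the pipe is closed, now collect the exit code
print(f"exit_code: {exit_code}")
```

Iterating over proc.stdout ends when the child closes its output, so no polling or sleeping is needed in this variant.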
Exercise: Processes
Given the following "external application":
import sys
import random
import time

def add_random(result_filename, count, wait, exception=''):
    total = 0
    for _ in range(int(count)):
        total += random.random()
    time.sleep(float(wait))
    if exception:
        raise Exception(exception)
    with open(result_filename, 'w') as fh:
        fh.write(str(total))

if __name__ == '__main__':
    if len(sys.argv) != 4 and len(sys.argv) != 5:
        exit(f"Usage: {sys.argv[0]} RESULT_FILENAME COUNT WAIT [EXCEPTION]")
    add_random(*sys.argv[1:])
It could be run with a command like this to create the a.txt file:
python examples/process/myrandom.py a.txt 3 1
Or like this, to raise an exception before creating the b.txt file:
python examples/process/myrandom.py b.txt 3 1 "bad thing"
Or it could be used like this:
from myrandom import add_random
add_random('b.txt', 2, 3)
add_random('c.txt', 2, 3, 'some error')
Write a program that will do "some work" that can be run in parallel and collect the data. Make the code work in a single process by default and allow the user to pass a number that will be the number of child processes to be used. When the child process exits it should save the results in a file and the parent process should read them in.
The "some work" can be accessing 10-20 machines using "ssh machine uptime" and creating a report from the results.
It can be fetching 10-20 URLs and reporting the size of each page.
It can be any other network intensive task.
Measure the elapsed time in both cases (single process and several child processes).
Subprocess TBD
Some partially ready examples
import time
import sys
import os
if len(sys.argv) != 2:
exit(f"Usage: {sys.argv[0]} SECONDS")
print(f"{int(time.time())} - start will take {sys.argv[1]} seconds (pid: {os.getpid()})", flush=True)
time.sleep(int(sys.argv[1]))
print(f"{int(time.time())} - started", flush=True)
while True:
    time.sleep(1)
    print(f"{int(time.time())} - running", flush=True)
#import os
#import time
#import signal
import subprocess
import sys
import time

#def test_hello():
# run_process([sys.executable, "examples/Hello-World.py"], )
#pid = os.fork()
#if pid is None:
# raise Exception("Could not fork")
#if pid:
# print(f"parent of {pid}")
# time.sleep(5)
# os.kill(pid, signal.SIGKILL)
#else:
# print("child")
# os.environ['PYTHONPATH'] = '.'
# os.exec("python examples/Hello-World.py")

def run_process(command, start_timeout):
    sleep_time = 0.5
    print(command)
    proc = subprocess.Popen(command,
        stdout = subprocess.PIPE,
        stderr = subprocess.STDOUT,
        universal_newlines=True,
        bufsize=0,
    )
    out = ""
    while True:
        print("Loop")
        exit_code = proc.poll() # returns the exit code or None if the process is still running
        if exit_code is not None:
            raise Exception("Server died")
        print(exit_code)
        this_out = proc.stdout.readline()
        #this_err = proc.stderr.readline()
        out += this_out
        print(f"Before sleep {sleep_time} for a total of {start_timeout}")
        time.sleep(sleep_time)
        start_timeout -= sleep_time
        if start_timeout <= 0:
            proc.terminate()
            raise Exception("The service has not properly started")
        if "started" in out:
            print(out)
            print("--------")
            print("It is now running")
            print("--------")
            break
    print("Do something interesting here that takes 2 seconds")
    time.sleep(2)
    proc.terminate()
    # communicate() waits for the process to end; only then is returncode set
    out, _ = proc.communicate()
    exit_code = proc.returncode
    return exit_code, out

print("Before")
exit_code, out = run_process([sys.executable, 'slow_starting_server.py', '3'], 4)
print("-----")
print(f"exit_code: {exit_code}")
print("OUT")
print(out)
import subprocess
import time
import os
import psutil

def run_process(*commands):
    print(f"Before Popen {os.getpid()}")
    processes = []
    for command in commands:
        proc = subprocess.Popen(command) # This starts running the external process
        print(f"After Popen of {proc.pid}")
        psproc = psutil.Process(proc.pid)
        print(f"name: {psproc.name()}")
        print(f"cmdline: {psproc.cmdline()}")
        processes.append(proc)
    time.sleep(1.5)
    print("Before communicate")
    for proc in processes:
        proc.communicate()
    print("After communicate")
    exit_codes = [proc.returncode for proc in processes]
    return exit_codes

print("Before run_process", flush=True)
exit_codes = run_process(
    ['python', 'process.py', '5', '0'],
    ['python', 'process.py', '4', '1'],
    ['python', 'process.py', '3', '2'],
)
print("After run_process", flush=True)
print(f'exit code: {exit_codes}', flush=True)
Command line arguments with argparse
Command line arguments
myprog.py data1.xls data2.xls
myprog.py --input data1.xls --output data2.xls
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--input', required=True)
parser.add_argument('--output', help="Some description")
args = parser.parse_args()
print(f"input: {args.input}")
print(f"output: {args.output}")
Modules to handle the command line
- argparse
You would like to allow the user to pass arguments on the command line. For example:
myprog.py server_name name True True
myprog.py --machine server_name --test name --verbose --debug
myprog.py -v -d
myprog.py -vd
myprog.py -dv
myprog.py -v -d -m server_name
myprog.py -vdm server_name
myprog.py file1 file2 file3
myprog.py --machine server_name --debug file1 file2 file3
myprog.py file1 file2 file3 --machine server_name --debug
argparse
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--fname') # optional named parameter that requires a value
parser.add_argument('--lname', help="Some description")
parser.add_argument('--max', help='max number of something', type=int) # check and convert to integer
parser.add_argument('--verbose', action='store_true') # "flag" no value is expected
parser.add_argument('--color', '-c') # short name also accepted
#parser.add_argument('files', help="filenames(s)") # a required positional argument
#parser.add_argument('dirs', nargs="*") # 0 or more positional
#parser.add_argument('places', nargs="+") # 1 or more positional
#parser.add_argument('ords', nargs="?") # 0 or 1 positional
parser.add_argument('--things', nargs="+") # --things a.txt b.txt c.txt
args = parser.parse_args()
print(f"fname: {args.fname}")
print(f"verbose: {args.verbose}")
print(f"things: {args.things}")
print(f"color: {args.color}")
print(f"max: {args.max}")
if args.verbose:
    print("we are making progress....")
Basic usage of argparse
Setting up the argparse already has some (little) added value.
import argparse
parser = argparse.ArgumentParser()
parser.parse_args()
print('the code...')
Running the script without any parameter will not interfere...
$ python argparse_basic.py
the code...
If the user tries to pass some parameters on the command line, the argparse will print an error message and stop the execution.
$ python argparse_basic.py foo
usage: argparse_basic.py [-h]
argparse_basic.py: error: unrecognized arguments: foo
$ python argparse_basic.py -h
usage: argparse_basic.py [-h]
optional arguments:
-h, --help show this help message and exit
The minimal set up of the argparse class already provides a (minimally) useful help message.
Positional argument
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('name', help='your full name')
args = parser.parse_args()
print(args.name)
$ python argparse_positional.py
usage: argparse_positional.py [-h] name
argparse_positional.py: error: the following arguments are required: name
$ python argparse_positional.py -h
usage: argparse_positional.py [-h] name
positional arguments:
name your full name
optional arguments:
-h, --help show this help message and exit
$ python argparse_positional.py Foo
Foo
$ python argparse_positional.py Foo Bar
usage: argparse_positional.py [-h] name
argparse_positional.py: error: unrecognized arguments: Bar
$ python argparse_positional.py "Foo Bar"
Foo Bar
Many positional arguments
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('files', help='filename(s)', nargs='+')
args = parser.parse_args()
print(args.files)
$ python argparse_positional_many.py
usage: argparse_positional_many.py [-h] files [files ...]
argparse_positional_many.py: error: the following arguments are required: files
$ python argparse_positional_many.py a.txt b.txt
['a.txt', 'b.txt']
Convert to integers
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('number', help='the number to take to the square')
args = parser.parse_args()
print(args.number * args.number)
$ python argparse_number.py abc
Traceback (most recent call last):
File "examples/argparse/argparse_number.py", line 10, in <module>
print(args.number * args.number)
TypeError: can't multiply sequence by non-int of type 'str'
When we try to use the argument received from the command line as an integer, we get a TypeError. The same would happen even if a number was passed, because every command line argument arrives as a string, but you could call int() on the parameter to convert it to an integer.
However, there is a better solution.
The same happens with the following:
$ python argparse_number.py 23
Traceback (most recent call last):
File "examples/argparse/argparse_number.py", line 10, in <module>
print(args.number * args.number)
TypeError: can't multiply sequence by non-int of type 'str'
Convert to integer
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('number', help='the number to take to the square', type=int)
args = parser.parse_args()
print(args.number * args.number)
$ python argparse_type.py abc
usage: argparse_type.py [-h] number
argparse_type.py: error: argument number: invalid int value: 'abc'
We got a much better error message as argparse already found out the argument was a string and not a number as expected.
$ python argparse_type.py 3.14
usage: argparse_type.py [-h] number
argparse_type.py: error: argument number: invalid int value: '3.14'
$ python argparse_type.py 23
529
The type parameter can be used to define the type restriction and type conversion of the attributes.
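The type parameter accepts any callable that takes a string and returns the converted value, not only int. The following minimal sketch illustrates this with float and pathlib.Path; the option names --ratio and --path are hypothetical and were chosen only for this illustration.

```python
import argparse
import pathlib

parser = argparse.ArgumentParser()
parser.add_argument('--ratio', type=float)        # converted by calling float()
parser.add_argument('--path', type=pathlib.Path)  # converted by calling pathlib.Path()

# Passing a list to parse_args lets us try the parser without a real command line
args = parser.parse_args(['--ratio', '0.5', '--path', 'data/in.txt'])
print(args.ratio * 2)
print(args.path.name)
```

Passing an explicit list to parse_args is also a convenient way to experiment with a parser in an interactive session.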
Named arguments
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--color', help='The name of the color')
args = parser.parse_args()
print(args.color)
python argparse_named.py --color Blue
Blue
python argparse_named.py
None
Named parameters are optional by default. You can pass the required=True parameter to make them required.
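A minimal sketch of a required named parameter follows. The program name argparse_required.py is hypothetical. When the parameter is missing, argparse prints a usage message to STDERR and stops the program by raising SystemExit; the sketch catches it only so that both cases can be shown in one script.

```python
import argparse

parser = argparse.ArgumentParser(prog='argparse_required.py')
parser.add_argument('--color', help='The name of the color', required=True)

# The parameter is supplied: parsing succeeds
args = parser.parse_args(['--color', 'Blue'])
print(args.color)

# The parameter is missing: argparse reports the error and exits
try:
    parser.parse_args([])
except SystemExit as err:
    exit_code = err.code
    print(f"argparse stopped with exit code {exit_code}")
```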
Boolean Flags
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--color', help='The name of the color')
parser.add_argument('--verbose', help='Print more data',
action='store_true')
args = parser.parse_args()
print(args.color)
print(args.verbose)
python argparse_boolean.py --color Blue --verbose
Blue
True
python argparse_boolean.py
None
False
Short names
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--color', '-c', help='The name of the color')
parser.add_argument('--verbose', '-v', help='Print more data',
action='store_true')
args = parser.parse_args()
print(args.color)
print(args.verbose)
python argparse_shortname.py -c Blue -v
python argparse_shortname.py -vc Blue
argparse print help explicitly
- print_help
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--age', help='Your age in years', type=float, required=True)
args = parser.parse_args()
if args.age < 0:
    parser.print_help()
    exit(1)
print(args.age)
Argparse xor - mutually exclusive - only one - exactly one
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--name')
action = parser.add_mutually_exclusive_group(required=True)
action.add_argument('--add', action='store_true')
action.add_argument('--remove', action='store_true')
args = parser.parse_args()
$ python argparse_xor.py
usage: argparse_xor.py [-h] [--name NAME] (--add | --remove)
argparse_xor.py: error: one of the arguments --add --remove is required
$ python argparse_xor.py --add
$ python argparse_xor.py --remove
$ python argparse_xor.py --add --remove
usage: argparse_xor.py [-h] [--name NAME] (--add | --remove)
argparse_xor.py: error: argument --remove: not allowed with argument --add
$ python argparse_xor.py --help
usage: argparse_xor.py [-h] [--name NAME] (--add | --remove)
optional arguments:
-h, --help show this help message and exit
--name NAME
--add
--remove
Argparse argument with default and optional value
- nargs
- const
- Instead of default we use the const parameter here.
- We tell argparse that the value of the parameter is optional by nargs='?'.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--level', help='Some level', type=int, const=10, nargs='?')
args = parser.parse_args()
print(args.level)
$ python argument_with_optional_value.py
None
$ python argument_with_optional_value.py --level
10
$ python argument_with_optional_value.py --level 20
20
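The const and default parameters can also be combined. In the following minimal sketch (a variation written for illustration, not the original example) default is used when the option is absent, const when the option is given without a value, and the supplied value otherwise.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--level', help='Some level', type=int,
    default=0, const=10, nargs='?')

print(parser.parse_args([]).level)                 # option absent: default
print(parser.parse_args(['--level']).level)        # option without value: const
print(parser.parse_args(['--level', '20']).level)  # option with value
```

With this setup the three calls print 0, 10 and 20 respectively.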
Conditional required parameter with argparse
import argparse
import sys
# Python Argparse conditionally required arguments
print(sys.argv)
main_parser = argparse.ArgumentParser(add_help=False)
main_parser.add_argument('--commit', help='Commit the downloaded data to git', action='store_true')
main_parser.add_argument('--html', help='Generate the HTML report', action='store_true')
main_parser.add_argument('--collect', help='Get the data from the Forem API', action='store_true')
main_args, _ = main_parser.parse_known_args()
#print(main_args)
print(main_args.commit)
print(main_args.html)
print(main_args.collect)
print(sys.argv)
parser = argparse.ArgumentParser(parents=[main_parser])
if main_args.collect:
    parser.add_argument('--username', help='The username on the Forem site', required=main_args.collect)
    parser.add_argument('--host', help='The hostname of the Forem site', required=main_args.collect)
    parser.add_argument('--limit', help='Max number of pages to fetch', type=int)
args = parser.parse_args()
print(args.collect)
if args.collect:
    print(args.username)
    print(args.host)
    print(args.limit)
print(args.html)
print(args.commit)
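A simpler alternative to the two-stage parsing above is to declare every parameter as optional and check the combination after parsing. This is a minimal sketch of that different technique, not the approach used in the example above; the helper name parse and the host value example.com are assumptions made for illustration. The error method of the parser prints the usage message and exits, just as argparse does for its own errors.

```python
import argparse

def parse(argv):
    # parse is a hypothetical helper so the sketch can be tried with several inputs
    parser = argparse.ArgumentParser()
    parser.add_argument('--collect', action='store_true')
    parser.add_argument('--username')
    parser.add_argument('--host')
    args = parser.parse_args(argv)
    # The condition is checked by hand once all the values are known
    if args.collect and (args.username is None or args.host is None):
        parser.error('--username and --host are required when --collect is given')
    return args

args = parse(['--collect', '--username', 'foo', '--host', 'example.com'])
print(args.username, args.host)

args = parse([])
print(args.collect)
```

The drawback is that the help message no longer shows which parameters are required; the benefit is that only one parser is needed.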
Exercise: Command line parameters
Take the code from the color selector exercise in the files section and change it so the user can supply the name of the file where the colors are listed using the --file filename option.
If the user supplies an incorrect color name (which is not listed among the accepted colors) give an error message and stop execution.
Allow the user to supply a flag called --force that will override the color-name-validity checking and will allow any color name.
Exercise: argparse positional and named
Create a script that can accept any number of filenames, the named parameter --machine and the flag --verbose.
Like this:
python ex.py file1 file2 file3 --machine MACHINE --verbose
Other
import argparse
import datetime
def vali_date(text: str) -> datetime.datetime:
    #return datetime.datetime.strptime(text, "%Y-%m-%d")
    try:
        return datetime.datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        raise argparse.ArgumentTypeError(f"This {text!r} is not a valid date.")

parser = argparse.ArgumentParser()
parser.add_argument(
    "--date",
    help = "Date in format YYYY-MM-DD",
    required = True,
    type = vali_date
)
args = parser.parse_args()
print(args.date)
import argparse
def is_age(age: str) -> float:
    try:
        new_age = float(age)
    except ValueError:
        raise argparse.ArgumentTypeError(f"This: {age!r} is not a valid number.")
    if new_age < 0:
        raise argparse.ArgumentTypeError(f"It must be a non-negative number. We received {age!r} ")
    return new_age

parser = argparse.ArgumentParser()
parser.add_argument("--age", type=is_age, required=True)
args = parser.parse_args()
print(args.age)
import getpass
secret = getpass.getpass()
print(secret)
JSON
JSON - JavaScript Object Notation
- json
JSON is basically the data format used by JavaScript. Because of its universal availability it became the de-facto standard for data communication between many different languages. Most dynamic languages have a fairly good mapping between JSON and their own data structures: lists and dictionaries in the case of Python.
Documentation of the Python json library.
JSON dumps
- dumps
- Dictionaries and lists are handled.
- Tuples are indistinguishable from lists.
- Always double quotes.
- null instead of None.
- No trailing comma.
import json
data = {
    "fname" : 'Foo',
    "lname" : 'Bar',
    "email" : None,
    "children" : [
        "Moo",
        "Koo",
        "Roo",
    ],
    "fixed": ("a", "b"),
}
print(data)
json_str = json.dumps(data)
print(json_str)
with open('data.json', 'w') as fh:
    fh.write(json_str)
{'fname': 'Foo', 'lname': 'Bar', 'email': None, 'children': ['Moo', 'Koo', 'Roo'], 'fixed': ('a', 'b')}
{"fname": "Foo", "lname": "Bar", "email": null, "children": ["Moo", "Koo", "Roo"], "fixed": ["a", "b"]}
dumps can be used to take a Python data structure and generate a string in JSON format. That string can then be saved in a file, inserted in a database, or sent over the wire.
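One more difference is worth knowing when converting dictionaries: JSON object keys are always strings, so dumps converts integer keys to strings and loads does not convert them back. A minimal sketch with made-up data:

```python
import json

data = {1: 'one', 2: 'two'}   # integer keys
json_str = json.dumps(data)
print(json_str)               # the keys are now quoted strings

restored = json.loads(json_str)
print(restored)
print(restored == data)       # the round trip changed the type of the keys
```

The last line prints False because the restored dictionary has the keys '1' and '2' instead of 1 and 2.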
JSON loads
- loads
import json
with open('data.json') as fh:
    json_str = fh.read()
print(json_str)
data = json.loads(json_str)
print(data)
{"fname": "Foo", "lname": "Bar", "email": null, "children": ["Moo", "Koo", "Roo"], "fixed": ["a", "b"]}
{'fname': 'Foo', 'lname': 'Bar', 'email': None, 'children': ['Moo', 'Koo', 'Roo'], 'fixed': ['a', 'b']}
dump
- dump
import json
data = {
    "fname" : 'Foo',
    "lname" : 'Bar',
    "email" : None,
    "children" : [
        "Moo",
        "Koo",
        "Roo",
    ],
}
print(data)
with open('data.json', 'w') as fh:
    json.dump(data, fh)
As a special case, dump writes the JSON string directly to a file or to another stream instead of returning it.
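The stream does not have to be a file on disk; any object with a write method will do. The following minimal sketch uses io.StringIO from the standard library as an in-memory stream (the data is made up for illustration) and shows that dump writes the same text that dumps would return.

```python
import io
import json

data = {"fname": "Foo", "children": ["Moo", "Koo"]}

buffer = io.StringIO()        # an in-memory text stream
json.dump(data, buffer)       # write the JSON text into the stream
print(buffer.getvalue())
print(buffer.getvalue() == json.dumps(data))

buffer.seek(0)                # rewind so the stream can be read from the start
restored = json.load(buffer)  # load reads from a stream the same way
print(restored)
```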
load
- load
import json
with open('data.json', 'r') as fh:
    data = json.load(fh)
print(data)
Round trip
- loads
- dumps
import json
import os
import time
import sys
if len(sys.argv) != 2:
    exit(f"Usage: {sys.argv[0]} NAME")
data = {
    'name': [],
    'time': [],
}
filename = 'mydata.json'
if os.path.exists(filename):
    with open(filename) as fh:
        json_str = fh.read()
        # print(json_str)
        data = json.loads(json_str)
data['name'].append(sys.argv[1])
data['time'].append(time.time())
with open(filename, 'w') as fh:
    json_str = json.dumps(data, indent=4)
    fh.write(json_str)
Pretty print JSON
import json
data = {
    "name" : "Foo Bar",
    "grades" : [23, 47, 99, 11],
    "children" : {
        "Peti Bar" : {
            "email": "peti@bar.com",
        },
        "Jenny Bar" : {
            "phone": "12345",
        },
    }
}
print(data)
print(json.dumps(data))
print(json.dumps(data, indent=4, separators=(',', ': ')))
{'name': 'Foo Bar', 'grades': [23, 47, 99, 11], 'children': {'Peti Bar': {'email': 'peti@bar.com'}, 'Jenny Bar': {'phone': '12345'}}}
{"name": "Foo Bar", "grades": [23, 47, 99, 11], "children": {"Peti Bar": {"email": "peti@bar.com"}, "Jenny Bar": {"phone": "12345"}}}
{
    "name": "Foo Bar",
    "grades": [
        23,
        47,
        99,
        11
    ],
    "children": {
        "Peti Bar": {
            "email": "peti@bar.com"
        },
        "Jenny Bar": {
            "phone": "12345"
        }
    }
}
Serialize Datetime objects in JSON
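The json module cannot serialize datetime objects by default: dumps raises a TypeError for them. One common approach is to pass a function in the default parameter of dumps, which is called for every object the encoder does not know how to handle. The following is a minimal sketch under stated assumptions: the function name encode_extra, the sample data, and the choice of the ISO 8601 string format are illustrative choices, not the only possible ones.

```python
import datetime
import json

def encode_extra(obj):
    # Called by json.dumps only for objects it cannot serialize by itself
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

data = {
    "name": "Foo",
    "created": datetime.datetime(2021, 11, 22, 15, 30, 0),
}

json_str = json.dumps(data, default=encode_extra)
print(json_str)

# Reading it back gives a plain string; converting it is up to the reader
restored = json.loads(json_str)
created = datetime.datetime.fromisoformat(restored["created"])
print(created)
```

Note that loads has no way of knowing which strings used to be dates, so the conversion back has to be done explicitly for the relevant fields.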
Sort keys in JSON
import json
data = {
    "name" : "Foo Bar",
    "grades" : [23, 47, 99, 11],
    "children" : {
        "Peti Bar" : {
            "email": "peti@bar.com",
        },
        "Jenny Bar" : {
            "phone": "12345",
        },
    }
}
print(json.dumps(data, sort_keys=True, indent=4, separators=(',', ': ')))
{
    "children": {
        "Jenny Bar": {
            "phone": "12345"
        },
        "Peti Bar": {
            "email": "peti@bar.com"
        }
    },
    "grades": [
        23,
        47,
        99,
        11
    ],
    "name": "Foo Bar"
}
Set order of keys in JSON - OrderedDict
- collections
- OrderedDict
from collections import OrderedDict
import json
d = {}
d['a'] = 1
d['b'] = 2
d['c'] = 3
d['d'] = 4
planned_order = ('b', 'c', 'd', 'a')
e = OrderedDict(sorted(d.items(), key=lambda x: planned_order.index(x[0])))
print(e)
out = json.dumps(e, sort_keys=False, indent=4, separators=(',', ': '))
print(out)
print('-----')
# Create index to value mapping dictionary from a list of values
planned_order = ('b', 'c', 'd', 'a')
plan = dict(zip(planned_order, range(len(planned_order))))
print(plan)
f = OrderedDict(sorted(d.items(), key=lambda x: plan[x[0]]))
print(f)
out = json.dumps(f, sort_keys=False, indent=4, separators=(',', ': '))
print(out)
OrderedDict([('b', 2), ('c', 3), ('d', 4), ('a', 1)])
{
    "b": 2,
    "c": 3,
    "d": 4,
    "a": 1
}
-----
{'b': 0, 'c': 1, 'd': 2, 'a': 3}
OrderedDict([('b', 2), ('c', 3), ('d', 4), ('a', 1)])
{
    "b": 2,
    "c": 3,
    "d": 4,
    "a": 1
}
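Since Python 3.7 regular dictionaries also preserve insertion order, so the same result can be reached without OrderedDict. The following minimal sketch builds a new dictionary in the planned order using a dictionary comprehension; it is an alternative written for illustration, not a replacement for the example above.

```python
import json

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
planned_order = ('b', 'c', 'd', 'a')

# Insert the keys in the order we would like to see them in the JSON output
ordered = {key: d[key] for key in planned_order}
out = json.dumps(ordered)
print(out)
```

This prints {"b": 2, "c": 3, "d": 4, "a": 1}. It assumes every key in planned_order exists in d; a missing key raises a KeyError.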
Exercise: Counter in JSON
Write a script that will provide several counters. The user can provide an argument on the command line and the script will increment and display that counter. Keep the current values of the counters in a single JSON file. The script should behave like this:
$ python counter.py foo
1
$ python counter.py foo
2
$ python counter.py bar
1
$ python counter.py foo
3
- Extend the exercise so if the user provides the --list flag then all the counters are listed (and no counting is done).
- Extend the exercise so if the user provides the --delete foo parameter then the counter foo is removed.
Exercise: Phone book in JSON
Write a script that acts as a phonebook. As "database" use a file in JSON format.
$ python phone.py Foo 123
Foo added
$ python phone.py Bar
Bar is not in the phonebook
$ python phone.py Bar 456
Bar added
$ python phone.py Bar
456
$ python phone.py Foo
123
- If the user provides Bar 123 save 123 for Bar.
- If the user provides Bar 456 tell the user Bar already has a phone number.
- To update a phone-number the user must provide --update Bar 456
- To remove a name the user must provide --delete Bar
- To list all the names the user can provide --list
Solution: Counter in JSON
import json
import sys
import os
filename = 'counter.json'
if len(sys.argv) != 2:
    print("Usage: " + sys.argv[0] + " COUNTER")
    exit()
counter = {}
if os.path.exists(filename):
    with open(filename) as fh:
        json_str = fh.read()
        counter = json.loads(json_str)
name = sys.argv[1]
if name in counter:
    counter[name] += 1
else:
    counter[name] = 1
print(counter[name])
with open(filename, 'w') as fh:
    json_str = json.dumps(counter)
    fh.write(json_str)
Solution: Phone book
import sys
import json
import os
def main():
    filename = 'phonebook.json'
    phonebook = {}
    if os.path.exists(filename):
        with open(filename) as fh:
            json_str = fh.read()
            phonebook = json.loads(json_str)
    if len(sys.argv) == 2:
        name = sys.argv[1]
        if name in phonebook:
            print(phonebook[name])
        else:
            print("{} is not in the phonebook".format(name))
        return
    if len(sys.argv) == 3:
        name = sys.argv[1]
        phone = sys.argv[2]
        phonebook[name] = phone
        with open(filename, 'w') as fh:
            json_str = json.dumps(phonebook)
            fh.write(json_str)
        return
    print("Invalid number of parameters")
    print("Usage: {} username [phone]".format(sys.argv[0]))

if __name__ == '__main__':
    main()
YAML
YAML - YAML Ain't Markup Language
- YAML
- Documentation of the PyYAML library.
Read YAML
- load
- Loader
# A comment
Course:
  Language:
    Name: Ladino
    IETF BCP 47: lad
  For speakers of:
    Name: English
    IETF BCP 47: en
  Special characters: []
Modules:
  - basic/
  - words/
  - verbs/
  - grammar/
  - names/
  - sentences/
import yaml
filename = "data.yaml"
with open(filename) as fh:
    data = yaml.load(fh, Loader=yaml.Loader)
print(data)
Write YAML
- dump
- Dumper
import yaml
filename = "out.yaml"
data = {
    "name": "Foo Bar",
    "children": ["Kid1", "Kid2", "Kid3"],
    "car": None,
    "code": 42,
}
with open(filename, 'w') as fh:
    yaml.dump(data, fh, Dumper=yaml.Dumper)
car: null
children:
- Kid1
- Kid2
- Kid3
code: 42
name: Foo Bar
Exercise: Counter in YAML
Exactly like the same exercise in the JSON chapter, but use a YAML file as the "database".
Exercise: Phone book in YAML
Exactly like the same exercise in the JSON chapter, but use a YAML file as the "database".
Solution: Counter in YAML
import sys
import os
import yaml
filename = "counter.yaml"
if len(sys.argv) > 2:
    exit(f"Usage: {sys.argv[0]} [NAME]")
counter = {}
if os.path.exists(filename):
    with open(filename) as fh:
        counter = yaml.load(fh, Loader=yaml.Loader)
if len(sys.argv) == 1:
    if counter:
        for key, value in counter.items():
            print(f"{key} {value}")
    else:
        print("No counters were found")
    exit()
name = sys.argv[1]
if name not in counter:
    counter[name] = 0
counter[name] += 1
print(counter[name])
with open(filename, 'w') as fh:
    yaml.dump(counter, fh, Dumper=yaml.Dumper)
Exception handling
0
1
3
def read_and_divide(filename):
    print("before " + filename)
    with open(filename, 'r') as fh:
        number = int(fh.readline())
        print(100 / number)
    print("after " + filename)
import sys
import module
files = sys.argv[1:]
for filename in files:
    try:
        module.read_and_divide(filename)
    except Exception as err:
        print(f" There was a problem in '{filename}'", file=sys.stderr)
        print(f" Text: {err}", file=sys.stderr)
        print(f" Name: {type(err).__name__}", file=sys.stderr)
    print('')
before one.txt
100.0
after one.txt
before zero.txt
There was a problem in 'zero.txt'
Text: division by zero
Name: ZeroDivisionError
before two.txt
There was a problem in 'two.txt'
Text: [Errno 2] No such file or directory: 'two.txt'
Name: FileNotFoundError
before three.txt
33.333333333333336
after three.txt