What is Python?
Simply put, Python is a programming language. It was initially developed by
Guido van Rossum in the late 1980s in the Netherlands.
Python is
developed as an open source project and is free to download and use as you
wish. On a technical level, Python is a strongly typed language in the sense
that every object in the language has a definite type, and there's generally no
way to circumvent that type. At the same time, Python is dynamically typed
meaning that there's no type checking of your code prior to running it.
This is
in contrast to statically typed languages like C++ or Java where a compiler
does a lot of type checking for you, rejecting programs that misuse objects.
Ultimately, the best description of the Python type system is that it uses duck
typing where an object's suitability for a context is only determined at
runtime.
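As a minimal sketch of these three ideas (strong typing, dynamic typing, and duck typing), consider the following. The function name describe_length is hypothetical and exists only for this illustration.

```python
# Strong typing: mismatched operands are not silently coerced.
try:
    "2" + 2
except TypeError as error:
    message = str(error)

# Dynamic typing: the same name can be rebound to objects of different types,
# and nothing checks this before the program runs.
value = 42
value = "forty-two"


# Duck typing: suitability is decided at runtime. Anything that supports
# len() is acceptable here, whatever its declared type.
def describe_length(thing):
    return len(thing)


lengths = [describe_length("abc"), describe_length([1, 2]), describe_length({"k": 1})]
print(lengths)
```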
Python is a general purpose programming language. It's not
intended for use in any particular domain or environment, but instead can be
fruitfully used in a wide variety of tasks. There are of course some areas
where it's less suitable than others, for example, in extremely time sensitive
or memory constrained environments, but for the most part Python is as flexible
and adaptable as many modern languages and more so than most. Python is an
interpreted language. This is a bit of a misstatement technically because
Python is normally compiled into a form of byte code before it's executed.
However, this compilation happens invisibly, and the experience of using Python
is one of immediately executing code without a noticeable compilation phase.
This lack of an interruption between editing and running is one of the great
joys of working with Python. The syntax of Python is designed to be clear,
readable, and expressive. Unlike many popular languages, Python uses whitespace
to delimit code blocks and in the process does away with reams of unnecessary
parentheses while enforcing a universal layout. There are multiple
implementations of the Python language. The original, and still by far the most
common implementation, is written in C. This version is commonly referred to as
CPython. When someone talks about running Python, it's normally safe to assume
that they are talking about CPython, and this is the implementation that we'll
be using for this course. Other implementations of Python include Jython, which
is written to target the Java virtual machine; IronPython, which targets the
.NET runtime; and PyPy, which is written in a specialized subset of Python
called RPython.
There are two important versions of the Python language in
common use right now, Python 2 and Python 3. Python 2 is older and more well
established than Python 3, but Python 3 addresses some known shortcomings in
the older version. Python 3 is the definite future of Python, and you should
use it if at all possible.
Python Overview
Beyond being a programming language, Python comes with a powerful and broad standard library. Part of the Python philosophy is "batteries included", meaning that you can use Python for many complex real-world tasks out of the box with no need to install third-party packages. This is not only extremely convenient, but it also makes it easier to get started with Python using interesting, engaging examples, something we aim for in this course. Another great effect of the batteries-included approach is that many scripts, even non-trivial ones, can be run immediately on any Python installation. This removes the common, annoying barrier of installing software that you face with some languages. The standard library has a generally high level of documentation. APIs are well documented, and the modules often have good narrative descriptions with quick start guides, best practice information, and so forth. The standard library documentation is always available online at python.org, and you can also install it locally if you need to.
The Read-Eval-Print-Loop or REPL
This Python command line environment is a Read-Eval-Print-Loop. Python will read whatever input we type in, evaluate it, print the result, and then loop back to the beginning. You'll often hear it referred to as simply the REPL. When started, the REPL will print some information about the version of Python you're running, and then it will give you a triple arrow prompt. This prompt tells you that Python is waiting for you to type something. Within an interactive Python session you can enter fragments of Python programs and see instant results. Let's start with some simple arithmetic. 2 + 2 is 4. As you can see, Python reads that input, evaluates it, prints the result, and loops around to do the same again. 6 * 7 is 42. We can assign to variables in the REPL, x = 5, and print their content simply by typing their name. We can also refer to variables and expressions such as 3 * x. As an aside, print is one of the biggest differences between Python 2 and Python 3. In Python 3 the parentheses are required whereas in Python 2 they were not. This is because in Python 3 print is a function call. More on functions later. At this point, we should show you how to exit the REPL. We do this by sending the end of file control character to Python, although unfortunately the means of sending this character varies across platforms. If you're on Windows, press Control+Z to exit. If you're on Mac or Linux, press Control+D to exit. If you regularly switch between platforms and you accidentally press Control+Z on a Unix-like system, you'll inadvertently suspend the Python interpreter and return to your operating system's shell. To reactivate Python making it a foreground process again, simply run the fg command, and press Enter a couple of times to get the triple arrow Python prompt back.
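The REPL session described above can be sketched as ordinary Python. Each line is what you would type at the triple arrow prompt; the variable names other than x are assumptions added so that the results can be inspected in a script.

```python
# Each line below corresponds to an entry typed at the >>> prompt.
result_sum = 2 + 2        # at the REPL, typing 2 + 2 echoes 4
result_product = 6 * 7    # at the REPL, typing 6 * 7 echoes 42
x = 5                     # an assignment echoes nothing
tripled = 3 * x           # at the REPL, typing 3 * x echoes 15

# In Python 3, print is a function, so the parentheses are required.
print(x)
```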
Significant Whitespace in Python
Start your Python 3 interpreter using the python or python3 command for Windows or Unix-like systems respectively. The control flow structures of Python, such as for loops, while loops, and if statements, are all introduced by statements which are terminated by a colon, indicating that the body of the construct is to follow. For example, for loops require a body. So, if you enter for i in range(5):, Python will change the prompt to three dots to request that you provide the body. One distinctive and sometimes controversial aspect of Python is that leading whitespace is syntactically significant. What this means is that Python uses indentation levels, rather than the braces used by other languages, to demarcate code blocks. By convention, contemporary Python code is indented by four spaces for each level, so we provide those four spaces and a statement to form the body of our loop, x = i * 10. Our loop body will contain a second statement. So, after pressing Return, at the next three dot prompt we enter another four spaces followed by a call to the built-in print function. To terminate our block, we must enter a blank line into the REPL. With the block complete, Python executes the pending code, printing out the multiples of 10 less than 50. Looking at a screen full of Python code, we can see how the indentation clearly matches, and in fact must match, the structure of the program. Even if we replace the code by gray lines, the structure of the program is clear. Each statement terminated by a colon starts a new line and introduces an additional level of indentation, which continues until a dedent restores the indentation to a previous level. Each level of indent is typically four spaces, although we'll cover the rules in more detail in a moment. Python's approach to significant whitespace has three great advantages. First, it forces developers to use a single level of indentation in a code block.
This is generally considered good practice in any language because it makes code much more readable. Second, code with significant whitespace doesn't need to be cluttered with unnecessary braces, and you never need to have a code standard debate about where the braces should go. All code blocks in Python code are easily identifiable, and everyone writes them the same way. Third, significant whitespace requires that a consistent interpretation must be given to the structure of the code by the author, the Python runtime system, and future maintainers who need to read the code, so you can never have code that contains a block from Python's point of view, but which doesn't look like it from a cursory human perspective. The rules for Python indentation can seem complex, but are straightforward in practice. The whitespace you use can either be spaces or tabs. The general consensus is that spaces are preferable to tabs, and four spaces has become standard in the Python community. One essential rule is never to mix spaces and tabs. The Python interpreter will complain, and your colleagues will hunt you down. You are allowed to use different amounts of indentation at different times if you wish. The essential rule is that consecutive lines of code at the same indentation level are considered to be part of the same code block. There are some exceptions to these rules, but they almost always have to do with improving code readability in other ways, for example, by breaking up necessarily long statements over multiple lines.
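The loop entered at the REPL in this section looks roughly like this when written in a file. The multiples list is an assumption added here so that the result can be inspected; the rest follows the description above.

```python
multiples = []  # not in the original session; added to collect the results

for i in range(5):
    # Both statements are indented by four spaces, so both belong to the loop body.
    x = i * 10
    print(x)
    multiples.append(x)

# Dedenting back to the left margin ends the block.
print(multiples)
```

At the REPL the block is ended by a blank line; in a file, returning to the previous indentation level is enough.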
Python Culture and the Zen of Python
Many programming languages are at the center of a cultural movement. They have their own communities, values, practices, and philosophy, and Python is no exception. The development of the Python language itself is managed through a series of documents called Python Enhancement Proposals or PEPs. One of the PEPs called PEP 8 explains how you should format your code, and we follow its guidelines throughout this course. It is PEP 8 which recommends we use four spaces for indentation in new Python code. Another of these PEPs called PEP 20 is called The Zen of Python. It refers to 20 aphorisms describing the guiding principles of Python, only 19 of which have been written down. Conveniently, The Zen of Python is never further away than the nearest Python interpreter as it can always be accessed from the REPL by typing import this.
Importing From the Python Standard Library
As mentioned earlier, Python comes with an extensive Standard Library, an aspect of Python often referred to as Batteries Included. The Standard Library is structured as modules, a topic we'll discuss in depth later in the course. What's important at this stage is to know that you gain access to Standard Library modules using the import keyword. The basic form of importing a module is simply the import keyword followed by a space and the name of the module. For example, let's see how we can use the Standard Library's math module to compute square roots. At the triple arrow prompt we type import math. Since import is a statement which doesn't return a value, Python doesn't print anything if the import succeeds, and we immediately return to the prompt. We can access the contents of the imported module by using the name of the module followed by a dot followed by the name of the attribute in the module that you need. As in many object-oriented languages, the dot operator is used to drill down into object structures. Being expert Pythonistas, we have inside knowledge that the math module contains a function called sqrt, which computes square roots. Let's try to use it. But how can we find out what other functions are available in the math module? The REPL has a special function, help, which can retrieve any embedded documentation from objects for which it has been provided, such as Standard Library modules. To get help, simply type help. We'll leave you to explore the first form for interactive help in your own time. We'll go for the second option and pass the math module as the object for which we want help. You can use the spacebar to page through the help, and if you're on Mac or Linux use the arrow keys to scroll up and down. Browsing through the functions we can see there's a math function for computing factorials. Press Q to exit the help browser and return us to the Python REPL.
The Python import statement has an alternative form, from module import function, that allows us to bring a specific function from a module into the current namespace. This is a good improvement over qualifying every call with the module name, but it's still a little long winded for such a simple expression. A third form of the import statement, from module import function as alias, allows us to rename the imported function. This can be useful for reasons of readability or to avoid a namespace clash.
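The three import forms can be sketched with the math module as follows. The calculation (the number of ways to choose 3 items from 5) and the alias fac are chosen here for illustration only.

```python
# Form 1: import the module and qualify names with the module name.
import math

root = math.sqrt(81)
ways = math.factorial(5) // (math.factorial(3) * math.factorial(5 - 3))

# Form 2: bring a specific function into the current namespace.
from math import factorial

ways_shorter = factorial(5) // (factorial(3) * factorial(5 - 3))

# Form 3: rename the imported function, here to the shorter alias fac.
from math import factorial as fac

ways_shortest = fac(5) // (fac(3) * fac(5 - 3))

print(root, ways, ways_shorter, ways_shortest)
```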
Scalar Types: int, float, None and boolean
Python comes with a number of built-in data types. These include primitive scalar types like integers, as well as collection types like dictionaries. These built-in types are powerful enough to be used alone for many programming needs, and they can be used as building blocks for creating more complex data types. In this section we'll cover the basic scalar built-in types int, float, None, and bool. We'll provide basic information about these now, showing their literal forms and how to create them. We've already seen quite a lot of Python integers in action. Python integers are signed and have, for all practical purposes, unlimited precision. Integer literals in Python are specified in decimal, but may also be specified in binary with a 0b prefix, octal with a 0o prefix, or hexadecimal with a 0x prefix. We can also construct integers by a call to the int constructor, which can convert from other numeric types such as floats to integers. Note that rounding is always towards 0. We can also convert strings to integers. To convert from base 3, call int with the string "10000" and the base 3 as arguments, that is int("10000", 3). Floating point numbers are supported in Python by the float type. Python floats are implemented as IEEE-754 double precision floating point numbers with 53 bits of binary precision. This is equivalent to between 15 and 16 significant digits in decimal. Any literal number containing a decimal point or a letter E is interpreted by Python as a float. Scientific notation can be used. So, for large numbers such as the approximate speed of light in meters per second, 3 times 10 to the 8, we can write 3e8. And for small numbers like the Planck length in meters, 1.616 times 10 to the -35, we can enter 1.616e-35. Notice how Python automatically switches the display representation to the most readable form. As with integers, we can convert to floats from other numeric or string types using the float constructor, for example from an int or from a string.
This is also how we create the special floating-point values nan, or not a number, and also positive and negative infinity. The result of any calculation involving int and float is promoted to a float. Python has a special null value called None with a capital N. None is frequently used to represent the absence of a value. The Python REPL never prints None results, so typing None into the REPL has no effect. None can be bound to a variable name just like any other object, and we can test whether an object is None by using Python's is operator. We can see here that the response is True, which brings us conveniently onto the bool type. The bool type represents logical states and plays an important role in several of Python's control flow structures, as we'll see shortly. As you would expect, there are two bool values, True and False, both with initial capitals, and also a bool constructor, which can be used to convert from other types to bool. Let's look at how it works. For integers, 0 is considered falsey, and all other values truthy. We see the same behavior with floats, where only 0 is considered falsey. When converting from collections such as strings or lists, only empty collections are treated as falsey.
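A short sketch of the scalar types covered in this section follows. The variable names and the particular sample values (such as 496 and 0.207) are assumptions made for illustration.

```python
# int literals in different bases, and the int constructor
binary = 0b10                  # 2
octal = 0o10                   # 8
hexadecimal = 0x10             # 16
toward_zero = (int(3.5), int(-3.5))   # rounding is always towards zero
from_string = int("496")
from_base_3 = int("10000", 3)  # 3 to the power 4

# float literals and the float constructor
speed_of_light = 3e8
small = 1.616e-35
from_int = float(7)
from_text = float("1.618")
not_a_number = float("nan")
infinity = float("inf")
promoted = 3.0 + 1             # int combined with float gives a float

# None and the is operator
a = None
a_is_none = a is None

# bool conversions: zero and empty collections are falsey
truthiness = [bool(0), bool(42), bool(0.0), bool(0.207),
              bool([]), bool([1, 5, 9]), bool(""), bool("Spam")]
print(truthiness)
```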
Relational Operators
Bool values are commonly produced by Python's relational operators, which can be used for comparing objects. The relational operators are sometimes called the comparison operators. These include value equality and inequality, less-than, greater-than, less-than or equal to, and greater-than or equal to. Two of the most widely used relational operators are Python's equality and inequality tests, which actually test for equivalence or inequivalence of values. That is, two objects are equivalent if one could be used in place of the other. We'll learn more about the notion of object equivalence later in the course. For now, we'll simply compare integers. Let's start by assigning or binding a value to a variable g. We test for equality with the double equals operator, ==, or for inequality using the not equals operator, !=. We can also compare the order of quantities using the so-called rich comparison operators less-than, greater-than, less-than or equal to, and greater-than or equal to.
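The six relational operators can be tried on a single integer as sketched below. The value 20 and the comparison targets are assumptions chosen for illustration.

```python
g = 20

comparisons = {
    "g == 20": g == 20,   # equality
    "g == 13": g == 13,
    "g != 13": g != 13,   # inequality
    "g < 30": g < 30,     # less-than
    "g > 30": g > 30,     # greater-than
    "g <= 20": g <= 20,   # less-than or equal to
    "g >= 20": g >= 20,   # greater-than or equal to
}

for expression, result in comparisons.items():
    print(expression, "is", result)
```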
Conditional Statements
Now that we've examined some basic built-in types, we'll look at two important control flow structures which depend on conversions to the bool type: while loops, but first if statements. Conditional statements allow us to branch execution based on the value of an expression. The form of the statement is the if keyword followed by an expression, terminated by a colon to introduce a new block. The expression is converted to bool as if by the bool constructor. Let's try this at the REPL. Remembering to indent four spaces within the block, we add some code to be executed if the condition is True, followed by a blank line to terminate the block, at which point the block will execute because self-evidently the condition is true. Conversely, if the condition is False, the code in the block does not execute. Because the expression used with the if statement will be converted to bool just as if the bool constructor had been used, writing if bool(expression): is exactly equivalent to writing if expression:. Thanks to this useful shorthand, explicit conversion to bool using the bool constructor is rarely used in Python. The if statement supports an optional else clause, which goes in a block introduced by the else keyword followed by a colon, which is indented to the same level as the if keyword. To start the else block in this case, we just omit the indentation after the three dots. For multiple conditions, you might be tempted to do something like this, nesting if statements. Whenever you find yourself with an else block containing a nested if statement like this, you should consider instead using Python's elif keyword, which is a combined else/if.
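As a sketch of the point about elif, here is the same decision written first with a nested if inside an else block and then flattened with elif. The function names and the thresholds 20 and 50 are hypothetical.

```python
def classify_nested(h):
    # An else block containing a nested if statement.
    if h > 50:
        return "Greater than 50"
    else:
        if h < 20:
            return "Less than 20"
        else:
            return "Between 20 and 50"


def classify_flat(h):
    # The same logic using elif, which removes one level of indentation.
    if h > 50:
        return "Greater than 50"
    elif h < 20:
        return "Less than 20"
    else:
        return "Between 20 and 50"


# The condition is converted to bool implicitly, so a non-empty
# string on its own is enough to make the block run.
if "eggs":
    answer = "Yes please!"

print(classify_flat(42), answer)
```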
Summary
Let's summarize what we've seen. We started by obtaining and installing Python 3 for Windows, Ubuntu Linux, and Mac OS X. We then looked at the Read-Eval-Print-Loop or REPL, and how it allows us to interactively explore Python code. We learned some simple arithmetic operators with plus, minus, multiply, divide, modulus, and the integer division operator with double slash. We discovered we could give objects names with the assignment operator using the equals symbol. We learned how to print objects, and we showed you how to exit the REPL, which is different on Windows with Control+Z or Linux and Mac with Control+D. We showed how Python uses significant indentation of code to demarcate code blocks. Each indent level is usually four spaces. And we told you about Python Enhancement Proposals, the documents which govern the evolution of the Python language. In particular, we briefly looked at PEP 8, which is the Python Style Guide, which we follow in this course, and PEP 20, The Zen of Python, which gives useful advice on writing Pythonic code. We looked at importing Python's Standard Library modules using the import statement. Import has three forms: Import module, from module import function, and from module import function as alias. We used all three of these forms during this course module. We showed how to find and browse help(), particularly useful for discovering the Standard Library. We looked at the four built-in scalar types int, float, None, and bool and showed how to convert between these types and use their literal forms. We looked at the six relational operators equality, inequality, less-than, greater-than, less-than or equal to, and greater-than or equal to. These are used for equivalence testing and ordering. We demonstrated structuring conditional code with if, elif, else structures. We shared iterating with while loops and how to interrupt execution of infinite loops using Control+C, which generates a keyboard interrupt exception. 
We gave an example of how to break out of a loop using the break statement, which breaks out of the innermost loop onto the first statement immediately following the loop, and along the way we looked at the augmented assignment operators for modifying objects in place such as counter variables. We also looked at requesting text from the user at the console with the built-in input() function. Next time here on Python Fundamentals we'll continue our exploration of Python's built-in types and control flow structures by looking at strings, lists, dictionaries, and for-loops. We'll even be using Python to fetch some data from the web for processing.
Strings and Collections Summary
Python has support for universal newlines, so no matter what platform you're using, it's sufficient to use a single \n character, safe in the knowledge that it will be appropriately translated to and from the native newline during I/O. Escape sequences provide an alternative means of incorporating newlines and other control characters into literal strings. The backslashes used for escaping can be a hindrance for Windows file system paths or regular expressions, so raw strings with an r prefix can be used to suppress the escaping mechanism.
Other types such as integers can be converted to strings using the str() constructor. Individual characters, returned as one-character strings, can be retrieved using square brackets with integer zero-based indices. Strings support a rich variety of operations, such as splitting, through their methods. In Python 3 literal strings can contain Unicode characters directly in the source. The bytes type has many of the capabilities of strings, but is a sequence of bytes rather than a sequence of Unicode codepoints. Bytes literals are prefixed with a lowercase b. To convert between string and bytes instances, we use the encode() method of str and the decode() method of bytes, in both cases passing the encoding, which we must know in advance. Lists are mutable, heterogeneous sequences of objects. List literals are delimited by square brackets, and the items are separated by commas. As with strings, individual elements can be retrieved by indexing into a list with square brackets. In contrast to strings, individual list elements can be replaced by assigning to the indexed item. Lists can be grown by appending to them and can be constructed from other sequences using the list() constructor. Dictionaries associate keys with values. Literal dictionaries are delimited by curly braces. The key value pairs are separated from each other by commas, and each key is associated with its corresponding value with a colon. For-loops take items one-by-one from an iterable object such as a list and bind the name to the current item. For-loops correspond to what are called for-each loops in other languages.
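The points in this summary can be sketched in a few lines. All of the sample values (the path, the words, the dictionary contents) are assumptions made for illustration.

```python
# Strings: raw strings, str(), indexing, and splitting
path = r"C:\Users\Merlin"        # raw string: backslashes are kept literally
as_text = str(496)
first_character = "parrot"[0]
words = "ships in the night".split()

# bytes: convert with encode() and decode(), passing the encoding
cafe = "café"                    # Unicode directly in the source
data = cafe.encode("utf-8")
round_trip = data.decode("utf-8")
literal = b"data"                # bytes literal with a lowercase b prefix

# Lists: mutable, heterogeneous sequences
items = ["apple", 7, 3.5]
items[1] = "orange"              # replace an element by assigning to its index
items.append("pear")
letters = list("abc")            # construct from another sequence

# Dictionaries: keys associated with values
capitals = {"Norway": "Oslo", "Sweden": "Stockholm"}

# For-loops bind a name to each item in turn
for country in capitals:
    print(country, capitals[country])
```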
Modularity - Summary
- Python code is placed in *.py files called "modules"
- Modules can be executed directly with
  - python module_name.py
- Brought into the REPL or other modules with
  - import module_name
- Named functions are defined with the def keyword
  - def function_name(arg1, argn):
- Return from functions using the return keyword with an optional parameter
- An omitted return parameter or an implicit return at the end returns None
- Use __name__ to determine how the module is being used
  - if __name__ == "__main__" the module is being executed
- Module code is executed exactly once, on first import
- def is a statement which binds a function definition to a name
- Command line arguments are accessible through sys.argv
  - The script filename is in sys.argv[0]
- Docstrings are a standalone literal string as the first statement of a function or module
  - Docstrings are delimited by triple quotes
  - Docstrings provide help()
  - Module docstrings should precede other statements
- Comments begin with # and run to the end of the line
- A special comment on the first line beginning #! controls module execution by the program loader.
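The points above can be sketched as one small module. The file name words.py and the functions count_words and main are hypothetical; they exist only to show where each piece goes.

```python
#!/usr/bin/env python3
"""Hypothetical module words.py: count the words given on the command line."""

import sys


def count_words(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


def main(args):
    """Print the word count of the command line arguments joined together."""
    # Falling off the end of this function implicitly returns None.
    print(count_words(" ".join(args)))


# True only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    main(sys.argv[1:])  # sys.argv[0] is the script filename, so skip it
```

Run as python words.py some text here, the module prints a count. After import words at the REPL, nothing is printed, and help(words.count_words) shows the docstring.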
Objects Summary
- Think of named references to objects rather than variables
  - Assignment attaches a name to an object
  - Assigning from one reference to another puts two name tags on the same object
- The garbage collector reclaims unreachable objects
- id() returns an identifier which is unique and constant for the lifetime of the object
  - rarely used in production
- The is operator determines equality of identity
- Test for equivalence using ==
- Function arguments are passed by object reference
  - functions can modify mutable arguments
- Reference is lost if a formal function argument is rebound
  - to change a mutable argument, replace its contents
- return also passes by object reference
- Function arguments can be specified with defaults
- Default argument expressions are evaluated once, when def is executed
- Python uses dynamic typing
  - we don't specify types in advance
- Python uses strong typing
  - types are not coerced to match
- Names are looked up in four nested scopes
  - LEGB rule: Local, Enclosing, Global, and Built-ins
- Global references can be read from local scope
- Use global to assign to global references from a local scope
- Everything in Python is an object
  - This includes modules and functions
  - They can be treated just like other objects
- import and def result in binding to named references
- type() can be used to determine the type of an object
- dir() can be used to introspect an object and get its attributes
- The name of a function or module object can be accessed through its __name__ attribute
- The docstring for a function or module object can be accessed through its __doc__ attribute
- Use len() to measure the length of a string
- You can multiply a string by an integer
  - Produces a new string with multiple copies of the operand
  - This is called the "repetition" operation
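Several of these points are easier to see in code. The sketch below is illustrative only; every function name in it is hypothetical.

```python
# Two names, one object: identity versus equivalence
a = [1, 2]
b = a            # a second name tag on the same list
c = [1, 2]       # an equivalent but distinct list
identity_checks = (a is b, a == c, a is c)


def replace_contents(seq):
    seq[:] = ["changed"]     # mutates the caller's object


def rebind(seq):
    seq = ["changed"]        # rebinds the local name only; the caller is unaffected


mutated, untouched = [0], [0]
replace_contents(mutated)
rebind(untouched)


# The default list is created once, when def is executed, and then shared.
def append_shared(item, target=[]):
    target.append(item)
    return target


# A common workaround: use None as the default and build a new list per call.
def append_fresh(item, target=None):
    if target is None:
        target = []
    target.append(item)
    return target


shared_first = append_shared(1)
shared_second = append_shared(2)

# Functions are objects with attributes; strings support repetition.
function_name = append_fresh.__name__
repeated = "ab" * 3
print(identity_checks, shared_second, repeated)
```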
Collection Summary
- Tuples are immutable sequence types
  - literal syntax: optional parentheses around a comma-separated list
  - Single element tuples must use a trailing comma
  - Tuple unpacking is useful for multiple return values and the idiomatic swap
- Strings are immutable sequence types of Unicode codepoints
  - string concatenation is most efficiently performed with join() on an empty separator
  - The partition() method is a useful and elegant string parsing tool
  - The format() method provides a powerful way of replacing placeholders with values
- Ranges represent integer sequences with regular intervals
  - Ranges are arithmetic progressions
  - The enumerate() function is often a superior alternative to range()
- Lists are heterogeneous mutable sequence types
  - Negative indexes work backwards from the end
  - Slicing allows us to copy all or part of the list
  - The full slice is a common idiom for copying lists, although the copy() method and list() constructor are less obscure
  - List repetition is shallow
- Dictionaries map immutable (hashable) keys to values, which may be mutable
  - Iteration and membership testing are done with respect to the keys
  - Order was arbitrary in older versions; since Python 3.7 dictionaries preserve insertion order
  - The keys(), values() and items() methods provide views onto different aspects of a dictionary, allowing convenient iteration
- Sets store an unordered collection of unique elements
  - sets support powerful and expressive set algebra operations and predicates
- Protocols such as iterable, sequence and container characterise the collections
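A few of the collection idioms listed above, sketched with made-up sample values:

```python
# Tuples: single-element syntax and the idiomatic swap via unpacking
single = (391,)
a, b = "jelly", "bean"
a, b = b, a

# Strings: join() on an empty separator, partition(), and format()
joined = "".join(["high", "way", "man"])
departure, separator, arrival = "London:Edinburgh".partition(":")
sentence = "The age of {0} is {1}".format("Jim", 32)

# Ranges and enumerate()
evens = list(range(0, 10, 2))
numbered = list(enumerate(["spring", "summer"]))

# Lists: negative indexing, full-slice copy, and shallow repetition
seq = [3, 186, 4431]
last = seq[-1]
duplicate = seq[:]           # a new list with the same elements
grid = [[0] * 2] * 2         # both rows refer to the same inner list
grid[0][0] = 1

# Dictionaries: iteration and membership are over the keys
colors = {"red": "#ff0000", "green": "#00ff00"}
has_red = "red" in colors
pairs = list(colors.items())

# Sets: unique elements and set algebra
blue_eyes = {"Olivia", "Harry", "Lily"}
blond_hair = {"Harry", "Jack", "Lily"}
both = blue_eyes & blond_hair
print(joined, arrival, sentence, grid, both)
```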
Exception Handling Summary
- Raising an exception interrupts normal program flow and transfers control to an exception handler.
- Exception handlers are defined using the try...except construct
  - try blocks define a context for detecting exceptions.
  - Corresponding except blocks handle specific exception types.
- Python uses exceptions pervasively.
  - Many built-in language features depend on them.
- except blocks can capture an exception object, which is often of a standard type.
- Programmer errors should not normally be handled.
- Exceptional conditions can be signalled using raise.
  - raise without an argument re-raises the current exception.
- Generally do not check for TypeErrors.
- Exception objects can be converted to strings using str().
- Use the try...finally construct to perform cleanup actions.
  - may be used in conjunction with except blocks.
- Output of print() can be redirected using the optional file argument.
- Use and and or for combining boolean expressions.
- Return codes are too easily ignored.
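The constructs in this summary can be sketched as follows. The functions convert and strict_convert and the events list are hypothetical names used only for this illustration.

```python
import sys


def convert(text):
    """Convert text to an int, returning -1 if it is not a valid number."""
    try:
        return int(text)
    except ValueError as error:
        # Exception objects convert to strings; print() can target stderr.
        print("Conversion error: {}".format(str(error)), file=sys.stderr)
        return -1


events = []  # records what happened, so the flow of control is visible


def strict_convert(text):
    """Convert text to an int, letting failures propagate to the caller."""
    try:
        return int(text)
    except ValueError:
        events.append("failed")
        raise                      # re-raise the current exception
    finally:
        events.append("cleanup")   # runs whether or not an exception occurred


good = convert("33")
bad = convert("hedgehog")

strict_convert("7")
try:
    strict_convert("hedgehog")
except ValueError:
    events.append("caller handled it")
```

Note that neither function catches TypeError: passing something that is not a string or number is treated as a programmer error and left unhandled.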
Comprehensions, Generators & Iterables Summary
Comprehensions
- Comprehensions are a concise syntax for describing lists, sets and dictionaries.
- Iterables are objects over which we can iterate item by item.
- We retrieve an iterator from an iterable object using the built-in iter() function.
- Iterators produce items one-by-one from the underlying iterable series each time they are passed to the built-in next() function.
Generators
- Generator functions allow us to describe series using imperative code.
- Generator functions contain at least one use of the yield keyword.
- Generators are iterators. When advanced with next() the generator starts or resumes execution up to and including the next yield.
- Each call to a generator function creates a new generator object
- Generators can maintain explicit state in local variables between iterations.
- Generator expressions have a similar syntactic form to list comprehensions and allow for a more declarative and concise way of creating generator objects.
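A compact sketch of comprehensions, the iteration protocol, and generators follows. The generator function take and the sample data are assumptions made for illustration.

```python
words = ["a", "bb", "cc"]

# Comprehensions for lists, sets, and dictionaries
squares = [x * x for x in range(5)]
unique_lengths = {len(word) for word in words}
length_by_word = {word: len(word) for word in words}

# The iteration protocol: iter() gives an iterator, next() advances it
iterator = iter(["spring", "summer"])
first = next(iterator)
second = next(iterator)


def take(count, iterable):
    """Yield at most the first count items from iterable."""
    counter = 0                # state kept in a local variable between yields
    for item in iterable:
        if counter == count:
            return             # ends the generator
        counter += 1
        yield item


first_two = list(take(2, squares))   # each call creates a new generator object

# A generator expression: like a list comprehension, but lazily evaluated
total = sum(x * x for x in range(5))
print(squares, first_two, total)
```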
Classes Summary
- All types in Python have a 'class'
- Classes define the structure and behaviour of an object
- Class is determined when the object is created
  - normally fixed for its lifetime
- Classes are the key support for OOP in Python.
- Classes are defined using the class keyword followed by a CamelCase name
- Class instances are created by calling the class as if it were a function
- Instance methods are functions defined inside a class
  - should accept an object instance called self as the first parameter
- Methods are called using instance.method()
- The optional __init__() method initializes new instances
  - if present, the constructor calls __init__()
  - __init__() is not the constructor
  - Arguments passed to the constructor are forwarded to the initializer
- Instance attributes are created simply by assigning to them
- Implementation details are denoted by a leading underscore
- Class invariants should be established in the initializer
- Methods can have docstrings, just like regular functions
- Classes can have docstrings
- Even within an object, method calls must be preceded with self
- You can have as many classes and functions in a module as you wish
- Polymorphism in Python is achieved through duck typing
- Polymorphism in Python does not require shared base classes or interfaces
- All methods are inherited, including special methods like the initializer
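The class-related points can be sketched with one small class and a duck-typing example. The classes Flight, Duck, and Robot, and the flight number format, are hypothetical and chosen only for illustration.

```python
class Flight:
    """A flight identified by a flight number such as 'BA758'."""

    def __init__(self, number):
        # The initializer establishes the class invariant.
        if not number[:2].isalpha():
            raise ValueError("No airline code in '{}'".format(number))
        self._number = number   # leading underscore: implementation detail

    def number(self):
        """Return the full flight number."""
        return self._number

    def airline_code(self):
        """Return the two-letter airline code."""
        # Even inside the object, method calls are prefixed with self.
        return self.number()[:2]


class Duck:
    def speak(self):
        return "Quack"


class Robot:
    def speak(self):
        return "Beep"


# Instances are created by calling the class as if it were a function.
flight = Flight("BA758")

# Duck typing: no shared base class is needed, only a speak() method.
sounds = [thing.speak() for thing in (Duck(), Robot())]
print(flight.airline_code(), sounds)
```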
Files and resource management Summary
- Files are opened using the built-in open() function, which accepts a file mode to control read/write/append behaviour and whether the file is to be treated as raw binary or encoded text data.
- For text data you should specify a text encoding.
- Text files deal with string objects and perform universal newline translation and string encoding.
- Binary files deal with bytes objects with no newline translation or encoding.
- When writing files, it's our responsibility to provide newline characters for line breaks
- Files should always be closed after use.
- Files provide various line-oriented methods for reading, and are also iterators which yield line by line.
- Files are context managers, and the with-statement can be used with context managers to ensure that clean-up operations, such as closing files, are performed.
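A minimal sketch of writing and then reading a text file with the with-statement follows. The file name and its contents are assumptions, and a temporary directory is used so that the example does not touch existing files.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Text mode with an explicit encoding; we supply the newline characters ourselves.
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("first line\n")
    f.write("second line\n")

# Files are iterators which yield one line at a time, newline included.
with open(path, mode="rt", encoding="utf-8") as f:
    lines = [line for line in f]

# Leaving the with-block closes the file, even if an exception was raised.
closed_after_block = f.closed
print(lines, closed_after_block)
```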
Maintainable code summary
- unittest is a framework for developing reliable automated tests
- You define test cases by subclassing unittest.TestCase
- unittest.main() is useful for running all the tests in a module.
- setUp() and tearDown() run code before and after each test method.
- Test methods are defined by creating method names that start with test_
- TestCase.assert.. methods make a test method fail when the right conditions aren't met.
- Use TestCase.assertRaises() in a with-statement to check that the right exceptions are thrown in a test
- Python's standard debugger is called PDB
- PDB is a standard command line debugger
- pdb.set_trace() can be used to stop program execution and enter the debugger.
- Your REPL's prompt will change to (Pdb) when you're in the debugger.
- You can access PDB's built-in help system by typing help
- Use "python -m pdb
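The unittest points in the maintainable code summary can be sketched as follows. The function under test, safe_divide, and the test class name are hypothetical.

```python
import unittest


def safe_divide(a, b):
    """Hypothetical function under test: divide a by b, rejecting zero."""
    if b == 0:
        raise ValueError("b must be non-zero")
    return a / b


class SafeDivideTests(unittest.TestCase):
    """Test cases are defined by subclassing unittest.TestCase."""

    def setUp(self):
        # Runs before each test method.
        self.numerator = 10

    def tearDown(self):
        # Runs after each test method.
        del self.numerator

    def test_divides(self):
        self.assertEqual(safe_divide(self.numerator, 4), 2.5)

    def test_zero_divisor_raises(self):
        # assertRaises used as a context manager in a with-statement.
        with self.assertRaises(ValueError):
            safe_divide(self.numerator, 0)
```

To run every test in the module when the file is executed directly, a call to unittest.main() is typically placed at the bottom of the file under an if __name__ == "__main__": guard.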