AI News, Difference between revisions of "Foundations of Computer Science"

Difference between revisions of "Foundations of Computer Science"

Have you ever wondered what computing is and how a computer works?

If we attempt to understand the concepts by studying the technologies that result from research in computing and computers, it can be overwhelming as technologies can be very complicated and constantly changing.

Another approach is to try to strip away all unnecessary complexity and focus on the fundamental ideas and underlying principles.

Because they are fundamental they should help develop a deeper understanding of discipline, for instance, we study general biology to get a big picture of biology.

Due to the abstract nature of the material we will use lots of analogies and metaphors to illustrate concepts to draw new concepts into our frame of understanding.

In this book we will try to unpack and illustrate big ideas, such as the nature of computing, representations of information, algorithms, abstraction, recursion, and computability, and use modern technologies to .

In this course, we try to focus on computing principles (big ideas) rather than computer technologies, which are tools and applications of the principles.

Computing is defined by a set of principles or ideas, which underlies a myriad of teachnologes that are created based on the principles.

In the second half of the course, we will study various technologies to demonstrate the power of computing and how principles are applied.

In addition to principles of computing and technologies there are practices of computing - what professionals do to advance computing.

As professionals in the field of computing we need to know the two ends and everything in the middle - the practices (activities and skills that make computing useful and effective).

One of the big ideas of computing is that information processes can be carried out purely mechanically via symbol manipulation.

Toward the end of the book we will see this is true for all modern computers - digital computers manipulate two symbols (zero and one) blindly according to instructions.

We can define a procedure P that takes two symbols ('a' through 'j') as the input and produces two symbols in the same set as the output.

Internally, the procedure uses the first input symbol to find a row that starts with the same symbol, then uses the second input symbol to find a column with the same symbol at the top, and then report/return the symbols at the cross point.

It is not hard to imagine such a table lookup procedure can be done purely mechanically (blindly) by a simple agent (e.g.

For example, if the symbols 'a' through 'j' represent quantities of 0 through 9 respectively, this procedure performs single decimal digit addition.

For instance, p(d, f) = p(3, 5) = ai = 08, which is the correct result of 3+5.

Now that we have a simple procedure P that can instruct a simple agent to add two single digit decimal numbers, we can devise a new procedure P1 that can add three single-digit decimal numbers as shown in the following chart.

The new procedure P1 employs three instances of procedure P to add three decimal digits and return two digits as the result.

We can view the procedures as machines with inputs and outputs and the lines are pipes that allow the symbols to go from one place to another place.

Note that the dotted rectangle represents the new procedure P1 made up of instances of P and the answer given by P1 for the sample inputs is correct.

Again the symbols used in the process can be any set of symbols because internally simple table lookups are performed.

P2 uses P1 to add two double-digit numbers, in fact we can simply add more P1s to the design to deal with any numbers of digits.

If we follow the same line of reasoning, it is not hard to imagine we can create increasingly more complex procedures to instruct the simple machine to do progressively more intelligent things, such as

In summary, from this example we can see that simple symbolic operations can be assembled to form lager procedures to perform amazing activities through computational processes.

If we can represent abstract ideas as symbols (as we represent abstract quantities as concrete numbers) and device procedures to manipulate the symbols according to the relations among the ideas we can model reasoning as computational processes.

This is what computer science is fundamentally about - information processes with two essential components: representations and a sequence of rules for manipulation of the representations.

The machine that carries out such processes does not need to know the meaning of the symbols and why the process yields correct results.

As an example, you can read about a mechanical computer (difference engine) designed by Charles Babbage that can tabulate polynomial functions: Richard Feynman used another similar analogy (file clerk) to explain how computers work from the inside out in his Computer Heuristics Lecture (1 hour 15 mins):

A computer's ability to perform amazing tasks depends on its ability to manipulate symbols according to well defined rules.

Gottfried Leibniz (1646-1716), a German philosopher, is considered the first person to dream of reducing reasoning to calculation and building a machine capable of carrying out such calculations.

His observed that in arithmetic we represent abstract quantities using symbols and manipulate the symbols to get useful results according to rules.

He dreamed that we could represent abstract ideas using symbols and reason with the ideas according to the logics between the ideas via similar concrete symbol manipulation as we do in arithmetic.

Such manipulations give use correct results not because whoever does the manipulation is intelligent but because the rules of manipulation mirrors relationships between quantities and logics between ideas.

A computer is fundamentally a physical device that can manipulate symbols following very simple logic rules.

Computer science is fundamentally about the information process (dealing with abstract ideas) that takes place through symbol manipulation, which follows a recipe (a set of rules).

No wonder so many computer programming books are called cookbooks :) In computer science we study how to represent information and how to design and apply algorithms to get meaningful results.

The engineering disciplines we try to apply in the software development process to produce quality software is called software engineering.

Practices, sometimes called 'know-hows', define someone's skill set and the level of competency: beginner, competent, and expert.

Programming an integral part of computer science because it allows us to explore abstract ideas in computer science in concrete ways.

In this course we will program in a very high-level graphical programming environment to explore ideas in computer science.

If we can represent information using symbols and know how to process the symbols and interpret the results properly we can get valuable new information.

Google published its self-driving car video on March 28 2012 to showcase its technology in building autonomous cars.

For example, we receive information when we read a book, listen to a story, watch a movie, or dream a dream.

We give information when we write an email, draw a picture, act in a show or give a speech.

For instance, a conversation on the phone communicates information but the information is represented by sound waves, electronic signals and etc.

Therefore before any information can be processed or communicated it must be quantified/digitalized - a process that turn information into (data) representations using symbols.

Shannon is considered the father of information theory because he is the first person who studied and built mathematical models for information and communication of information.

His seminal paper “A mathematical theory of communication” (1948) changed our view of information laying the foundation for the information age.

Shannon discovered that the fundamental unit of information is a yes or no answer to a question or one bit with two distinct states, which can be represented by only two symbols.

He also founded the design theory of digital computer/circuit by proving that propositions of Boolean algebra can be used to build a 'logic machine' capable of carrying out general computation (manipulation of two types of symbols).

The external representation is used for communication between human and computers, but internally all information is stored and processed using a more unified representation.

Everything we see on a computer (monitor) screen, whether it is text, image, or motion picture, is a representation of certain information.

Computers also represent information externally using sound and other media, such as touch pad for the blind to read text.

This makes it easy to represent bits physically - any device capable of having two distinct states works, e.g.

With ten fingers humans conveniently adopted the base ten (decimal) numbering system, which requires ten different symbols, for instance the arabic numerals uses 0 through 9.





In fact we commonly use other base systems to represent quantities of different nature: base 7 for drays in a week, base 60 for minutes in an hour, 24 for hours in a day, 16 for ounces in a pound, and etc.

It is not hard to imagine base 2 (two symbols) is the simplest base system because with less than two symbols we can not represent change therefore no information.

All base systems work in the same way - the right most digit represent quantity of the base raised to the zeroth power (always 1), and each digit to the left represents a quantity that is base times larger than the one represented by the digit immediately to the right.



When we use different base systems it is necessary indicate the base as the subscript to avoid confusion, for example it becomes clear that



The computer science unplugged organization produced an class activity to demonstrate the idea of binary numbers using a deck of cards.



In studying computing we often need to convert between decimal representation, which we are most familiar with, and binary representation, which is used internal by computers.

Converting the binary representation of a non-negative integer to its decimal representation is straight-forward process - summing up the quantities each binary digit represents yields the result.








by dividing 23 by 2, writing down the remainder in the third column, and repeating the same process on the quotient till it becomes zero.

In addition to numbers bits are used internally to represent many other types of information or external representations of information.

piece of text can be viewed as a stream of symbols can be represented/encoded as a sequence of bits resulting in a stream of bits for the text.


perceived image is the result of light beams physically coming into our eyes and triggering nerves to send signals to our brain.

In computing an image is simulated by a rectangular grid of dots (pixels - picture element), each of which has a particular color.

In fact, the computer screen use such a grid of pixels to display images and text, each symbol of which is also represented by a grid of pixels externally but internally each symbol is represented as an ASCII or Unicode code.

In addition the encoding of an image includes other meta data, such as the size of the image, the encoding standard, and the date time when it is created.

It is not hard to imagine that videos can be encoded as series of image frames with synchronized audio tracks also encoded using bits.

For instance, if you were to represent a decimal number by writing it down on a piece of paper the size of the paper and the size of the font limit how many digits you can put down.












Note that they are all powers of twos because they are limited by the number bits available for the counting (representing the counts):







For example, one third can never be represented precisely by a decimal format with a fractional part because there will be an infinite number of threes after the decimal point.




The following video demonstrates how limits in the binary representation can cause numerical errors in computing precisely because computers use binary representation internally: This is a limit that is hard to overcome completely but there are practical means to deal with such round-off error so that the computational results are still useful.

We learned that computers use the binary system internally to represent everything as sequence of bits - zeros and ones.

Chapter 1 of the Blown to Bits book talks about the digital explosion of bits as the result of the innovations in computing and technologies, which enable us to turn information into bits and shared them with unprecedented speed.

We will learn that information processes start with conceptual solutions to problems and then can be implemented (coded) in ways understandable to machines.

You are already familiar with many algorithms, such as tying your shoes, making coffee, send an email, and cooking a dish according to a recipe.

For an algorithm to be useful, it must be correct - the steps must be logical and specific for machines to carry out - and efficient - it must finish in a reasonable amount of time.

Alan Turning is the first person who studied algorithms mathematically by creating a universal machine model, later called Turing machine.

He also proved that computation is unavoidable in circumstances - we will have to perform the computation to the result, which separates computing from mathematics (the birth of computer science).

The turing machine can represent/encode information in a standard form and interpret and updates the representation according to rules (algorithms), which is part of the representation.

For instance, we can view each algorithm as a state machine: an algorithm always starts with a state - represented by a data representation of the input and the internal information - and move through a number of states to its final state as the result of performing operations prescribed in the algorithm.

When the number of possible initial states approach infinity the same algorithm can generate potentially infinite number of computation.

In computing we deal with representations of information (data), so a step can be adding two integers and storing the result to a variable.

Machine languages are written in instructions consisting of zeros and ones (binary bits) because computers are fundamentally machines that can manipulate two types of symbols.

Each different type of machine is designed to understand its own native language -- patterns of zeros and ones -- because they can be very different computing hardware.

Normally we write programs to express our algorithms in high level languages - languages that are close to our natural language, e.g.

Then we use tools (compilers and interpreters) to translate our programs in higher level languages to machine languages, which is analogous to using a personal interpreter when we travel abroad and don't understand the native language.

High level languages hides the differences between machines to allow us to write programs in a machine independent way, which is a huge time saver.

So the steps or units of work must be defined in terms of a higher abstraction - a set of common data structure, operations and control structures that all languages support.

By creating algorithms conceptually using abstractions allows us humans to think on a higher level closer to the problem domain we know.

The following diagram shows that the same algorithm can be implemented in a variety of programming languages and the result programs can be executed on different machine models (maybe after some translation).

We know how to carry it out manually, but we would certainly not solve the problem using the process expressed in psedo code.

The psedo code, even though in natural language, must use the aforementioned constructs (data structures, operations, and control structures) that are common to computers.

Even we, as human beings, will not be able to scan a long list, say a million numbers, and find the largest number at a glance.

The algorithm we write must be expressed in terms of what a computer can do and are scalable to inputs (data sets) of arbitrary sizes.

In fact, it makes little difference to a computer whether the list has three numbers or three million numbers.

The top-most conditional defines a repetition (loop) because their an unconditional branching back to the conditional as expressed in the arrow with no label.

In summary constructing and studying algorithms allows us to study (algorithmic) solution to problems in a language and computing environment neutral way.

We could use the 'finding largest number' algorithm as a single block to form more complex algorithms, but we need to keep in mind this block is not a unit of work as the number of steps involved depends on the input size (the length of the list).

We will revisit this topic in the future when we talk about functional decomposition (to keep algorithms simple) and algorithm complexity analysis (count the cost of algorithms).

For instance, if we have a list of students we can search a student by last name, birthdate, or shoe size, which are search keys.

However, if the list is ordered by the search key we can use a much better algorithm by taking advantage of this ordered property of the data.

For instance, home phone numbers in a phone books are usually ordered by owner's last names and business phone numbers are ordered by business types.

If the list of numbers are random but ordered increasingly we can always guess the target number is in the middle of the search range (initially the whole list).

If we are not lucky, we can eliminate half of the candidates - if the number in the middle is greater than the search target the new search range will be the first half of the previous search range, otherwise the second half.

When solving a Rubik's cube (of any size), it is important to know the step by step instructions or algorithm to reach an efficient solution.

The top down approach to design is starting by examining the full problem or one way to think of it is to look at the big/whole picture first.

The general will breakdown a mission by assigning each soldier with a specific task to complete that in turn contributes to a critical part of the overall mission.

Using this method ensures that the smallest unit has been successfully tested and therefore, when you start solving or implementing the next sub-solution that it will work due to the previous layers working successfully.

The following four basic logic structures describe the type of blocks used for creating algorithms:

According to Wikipedia, a constructed language 'is a language whose phonology, grammar, and vocabulary has been consciously devised for human or human-like communication, instead of having developed naturally'.

Computing algorithms can be described/defined in terms of the common units of work that computers can do so that they are neutral to programming languages and the execution environment (computers).

If we liken a computer to a car, programmers are drivers who can make the car do different things (to a certain extent) and users are like passengers taking advantage of the things the car can do.

This type of errors are easy to fix, in fact most modern program editors can detect and warn us about such errors.

Another more subtle bug may cause the computer program to never finish, known as the infinite loop, which is obviously not what we wanted.

If the actual output matches the desired output the program passes the test, otherwise their is a bug in the program or in the test (tests can be buggy too).

When a program becomes larger and more complex the number of tests need to cover all possible cases grows very fast.

There are usually always multiple ways to solve the same problem and, therefore, multiple algorithms to make computers solve the problem for us.

In computing we care the most about the solution speed because a fast algorithm can solve larger problem or more problems in the same amount of time.

One possible approach is to implement the algorithm, run the result program, and measure its execution time - elapsed time between the start and the end of a program run.

Secondly, to run two programs to compare their execution time we must subject them to the same input (a.k.a a workload, e.g a list of one million numbers to be sorted) and the size of the input is never ideal.

The amount of food needed will surely magnify the difference - as we increase the amount of ingredients the lengthy recipes will take even longer to follow.

If a recipe involves beating eggs can instruct the cook to break each egg and scramble it individually or break all eggs first and scramble them together.

Recall that algorithms must be defined in terms of the units of work (steps) computers can perform so that it is straightforward to implement them in programming languages.

The way the steps are ordered and organized in algorithms can significantly affect the execution time of the programs implement the algorithms.

In this approach we take an algorithm described in pseudo code or flow chart, count the number of steps (units of work), which is always a function of the input size.

In the aforementioned example recipe the time it takes to follow the first method (break each egg and scramble it individually) is directly proportional to the number of eggs involved.

Instead of measuring the steps taken for a particular input size, we focus on the relationship function between the number of steps and the input size, which shows the pattern in which the amount of work (cost) grows as the input size increases.

Then, we apply asymptotic analysis, which compare functions as inputs approach infinity, to simplify the functions because as the input size approaches infinity the difference between the units of works disappears (we can assume breaking an egg and scrambling it take the same amount of time) and the cost of most 'complex' part of the task will dominate (a part that repeats a step 10 time will dwarf any step that is only done once) the total cost.

For example, in a recipe we have a one quick step (takes 0.0001 seconds per serving) that repeats 10 times and a slow step (takes 10 seconds per serving) doesn't repeat, an amount of serving of N (input size) would cost

In asymptotic analysis we can ignore the slow step because its contribution to the total cost is negligible when N approaches infinity.

With simplify growth functions we can put them into categories considering algorithms in each category to have similar performance characteristics.

This type of analysis is not completely precise but it can be very useful in studying the nature of algorithms and predicting their performances.

The following script implements algorithm 1 using a block that iterates through all the numbers in the range and adding them to the sum one at a time.

The same problem can be solved using algorithm 2, which use a function to calculate the result as shown in the following script and the report block.

If you run the programs and try different input size you will observe that as you increase the input size the execution time of the first program increases steadily whereas the second program shows not change in execution time.

The first algorithm loops through the numbers to get the sum, therefore the relationship function between cost (the number of steps - additions) and the input size is

We assign algorithms with this type of growth function into the leaner algorithm to the leaner this type of functions linear time category denoted by

The second algorithm always takes the same amount of time because it performs the same set of operations regardless of the input size.

When we analyze algorithms we focus on worst cases because the performance of an algorithm is dependent from the actual input and we should not rely on luck!

because in the worst case (all numbers are unique) the algorithm has to compare each number to all the other numbers in the list before it arrive at the answer.



If you are interested you can reproduce and run the following script (an implementation of the algorithm) to verify the predicted performance characteristics of the algorithm.

This example is a little contrived but it demonstrate a new performance category and is very useful when we discuss encryption techniques.

Assuming outputting a number is the unit of work (it takes the same time output a number) the cost to generate all one digit numbers is 10.





One technique we use to keep our algorithms and programs simple is abstraction, which is an idea widely used in many fields such as art, math and engineering.

The internal working of a car is unnecessary detail to drivers unless they are race car drivers who need this type knowledges to drive the car most efficiently.

This car interface extracts common car features (what a driver needs to know to drive a car) from all kinds of cars.

We have learned that algorithm design is a integral part of problem solving using computing and a first step in programming.

you can implement an algorithm as a block, which then can be used anywhere in your script as long as you can call the block with a proper sequence of parameters according to the interface.

Layering separates the functional units (blocks) into layers to separate concerns and simply interactions patterns to make the complexity more manageable.

On the contrary if any arbitrary interaction is allowed we may end up with tightly coupled system as shown in the following figure.

The machine was built through abstraction - we construct larger building blocks using smaller (more elementary) ones and treat each block as a unit ignoring the internal details.

In order to make a sprite to draw different equal-lateral shapes we can create a block for drawing each type of shapes such as triangles, squares, pentagons, and etc.

The draw triangle block repeat three times to draw each side with a turn of 120 degrees.

Through this example, we have demonstrated defining blocks to abstract away details of a task and generalizing a solution to solve more problems.

Similarly, a directory tree of a file system on a computer and an ancestry tree genealogy exhibit a similar pattern.

can be calculated from 4!, which is 4 times 3!, which is 3 time 2!, which is 2 times 1!, which is 1 by definition.

If the problem we are solving is self-similar - solving the whole problem is a matter of solving the parts that are similar to the whole - the solution we are defining for the whole can be used to solve the subproblems(recursion).

To design a recursive solution we practice wishful thinking - as we describe the solution to whole problem we wish it is fully defined and can be used to solve the smaller subproblems.

To program recursively, we describe the solution to a problem as a function/procedure/block, in which we break the bigger problems into smaller ones and use the function we defining to solve the smaller problems.

By the time our program reaches all base cases, we would have solved the whole problem because all the subproblems are solved including the problem we start with.

If both parts exists and are structured properly, the algorithm (function) can solve problems of any size by asking clones of itself to solve partial problems.

Recursive problem solving is a powerful idea because it simples our thinking: if you can define a problem recursively you can solve it recursively.

To search one half of the ordered list is just a smaller version/part of the overall problem so we can apply the same algorithm wishing the algorithm is fully defined.

Next, if the middle item is greater than the target the first half of the list is searched, else, we search the second half of the list.

The image previously shown uses a helper block called 'middle between' which is used to find the middle index when given the first and last index of a sorted list (see below).

The helper block below shows the target and list being passed in as input into the 'binary search for' block (see below).

The actual recursive solution is implemented when the recursive search block detects the first and last index of a sorted list.

The 'binary search for' block allows users to call the recursive search without the user being required to provide the first and last indices.

higher order functions offer a more powerful ways to generalize solutions to problems by allowing blocks to take blocks as parameters and returning a block as a return value.

An example higher order function in math is the derivative function which takes a function as the input and produces another function (the derivative of the first function) as the output.

In computer science, a map function takes an arbitrary function and a data set (e.g.

Another example is the reduce (or fold) function, which takes a input function and data set and produces the aggregation of all items in the data set using the function.

For instance, if the input function is the addition the reduce function returns the sum of all items in the data set as the output.

It doesn't simply programming because someone has to write this map block (check the source code to see how complicated it is), but it makes programmers happier because it keeps the thinking part simpler.

Note that the reduce (combine with) function can take any function with two input parameters to aggregate a list of values to a single value.

By using higher order functions we can create generalized solutions that can be customized to solve a larger set of problems.

In this block two reporter blocks are taken in as parameters, which are then used to form the report value - a new block.

To use the composed function we can call the compose function with the two input functions and use the 'call' block to execute the composed block against an input value as shown in the following script.

With this 'compose' block we define new functions by combining existing functions on the fly - when we need them.

In 1936 Alan Turing proved that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist, i.e.

More specifically, the halting problem is undecidable because it is the simplest type of question that answers a yes or no (decision) question: whether a program will eventually halt (stop) given an input.

Obviously it is infeasible to actually run the program with the input to see whether it will halt because it may take forever if there is an infinite loop in the program.

All proofs by contradiction start with an assumption that the proposition we are trying to proof is false, follow a logical sequence of valid steps, and arrive at a conclusion that is clearly false or contradicts, which is also false because a statement can be either true or false but never both.

If we assume the halting problem is decidable/solvable we should be able to design an algorithm and implement it as a program that solve the problem for us.

The program should take a program and the input to the program as a input parameters and return the answer - whether the program will halt or not on the input.

takes a block (program) as input date and an interpreter program takes the source code of a program as data and runs the program.

On the contrary if the best known algorithm we know takes a long time to solve a problem, it is hard because computers cannot solve it fast.

If the big-O notation of a category contains only polynomial terms, the problems solvable using algorithms in this category are called P problems (Polynomial time solutions exist), such as



For example, integer factorization (or prime decomposition) problem has no known polynomial time solution but given an answer we can verify it very quickly by simply multiplying the factors and comparing the result to the integer.

Collectively we call problems that take too long to solve intractable problems, which include problems with best algorithm in exponential time (






Obviously P is a subset of NP because NP is defined only by polynomial answer verification time and being able to solve a problem in polynomial time (P) certainly qualifies for that.

All problems in this category are NP problems sharing one special property - ALL other NP-complete problems can be translated/reduced to each of the NP-complete problems in polynomial time.

Most encryption algorithms are computationally secure because breaking them are NP-problems - there is no known efficient polynomial time solution.

Computational complexity theory

Computational complexity theory is a branch of the theory of computation in theoretical computer science that focuses on classifying computational problems according to their inherent difficulty, and relating the resulting complexity classes to each other.[1]

A computational problem is understood to be a task that is in principle amenable to being solved by mechanical application of mathematical steps, such as an algorithm, which is equivalent to stating that the problem may be solved by a computer.

The theory formalizes this intuition, by introducing mathematical models of computation to study these problems and quantifying their computational complexity, i.e., the amount of resources needed to solve them, such as time and storage.

Other measures of complexity are also used, such as the amount of communication (used in communication complexity), the number of gates in a circuit (used in circuit complexity) and the number of processors (used in parallel computing).

A key distinction between analysis of algorithms and computational complexity theory is that the former is devoted to analyzing the amount of resources needed by a particular algorithm to solve a problem, whereas the latter asks a more general question about all possible algorithms that could be used to solve the same problem.

In turn, imposing restrictions on the available resources is what distinguishes computational complexity from computability theory: the latter theory asks what kind of problems can, in principle, be solved algorithmically.

The instance is a number (e.g., 15) and the solution is 'yes' if the number is prime and 'no' otherwise (in this case, 15 is not prime and the answer is 'no').

To further highlight the difference between a problem and an instance, consider the following instance of the decision version of the traveling salesman problem: Is there a route of at most 2000 kilometres passing through all of Germany's 15 largest cities?

The quantitative answer to this particular problem instance is of little use for solving other instances of the problem, such as asking for a round trip through all sites in Milan whose total length is at most 10 km.

For example, integers can be represented in binary notation, and graphs can be encoded directly via their adjacency matrices, or by encoding their adjacency lists in binary.

Even though some proofs of complexity-theoretic theorems regularly assume some concrete choice of input encoding, one tries to keep the discussion abstract enough to be independent of the choice of encoding.

A decision problem can be viewed as a formal language, where the members of the language are instances whose output is yes, and the non-members are those instances whose output is no.

The formal language associated with this decision problem is then the set of all connected graphs — to obtain a precise definition of this language, one has to decide how graphs are encoded as binary strings.

function problem is a computational problem where a single output (of a total function) is expected for every input, but the output is more complex than that of a decision problem—that is, the output isn't just yes or no.

For instance, in the problem of finding whether a graph is connected, how much more time does it take to solve a problem for a graph with 2n vertices compared to the time taken for a graph with n vertices?

Since the time taken on different inputs of the same size can be different, the worst-case time complexity T(n) is defined to be the maximum time taken over all inputs of size n.

Turing machines are not intended as a practical computing technology, but rather as a general model of a computing machine—anything from an advanced supercomputer to a mathematician with a pencil and paper.

Furthermore, it is known that everything that can be computed on other models of computation known to us today, such as a RAM machine, Conway's Game of Life, cellular automata or any programming language can be computed on a Turing machine.

Since Turing machines are easy to analyze mathematically, and are believed to be as powerful as any other model of computation, the Turing machine is the most commonly used model in complexity theory.

Many types of Turing machines are used to define complexity classes, such as deterministic Turing machines, probabilistic Turing machines, non-deterministic Turing machines, quantum Turing machines, symmetric Turing machines and alternating Turing machines.

A non-deterministic Turing machine is a deterministic Turing machine with an added feature of non-determinism, which allows a Turing machine to have multiple possible future actions from a given state.

One way to view non-determinism is that the Turing machine branches into many possible computational paths at each step, and if it solves the problem in any of these branches, it is said to have solved the problem.

Clearly, this model is not meant to be a physically realizable model, it is just a theoretically interesting abstract machine that gives rise to particularly interesting complexity classes.

The non-deterministic Turing machine has very little to do with how we physically want to compute algorithms, but its branching exactly captures many of the mathematical models we want to analyze, so that non-deterministic time is a very important resource in analyzing computational problems.

For a precise definition of what it means to solve a problem using a given amount of time and space, a computational model such as the deterministic Turing machine is used.

The time required by a deterministic Turing machine M on input x is the total number of state transitions, or steps, the machine makes before it halts and outputs the answer ('yes' or 'no').

Other complexity measures used in complexity theory include communication complexity, circuit complexity, and decision tree complexity.

The best, worst and average case complexity refer to three different ways of measuring the time complexity (or any other complexity measure) of different inputs of the same size.

If we assume that all possible permutations of the input list are equally likely, the average time taken for sorting is O(n log n).

To classify the computation time (or similar resources, such as space consumption), one is interested in proving upper and lower bounds on the maximum amount of time required by the most efficient algorithm solving a given problem.

But bounding the computation time above by some concrete function f(n) often yields complexity classes that depend on the chosen machine model.

x is any binary string} can be solved in linear time on a multi-tape Turing machine, but necessarily requires quadratic time in the model of single-tape Turing machines.

If we allow polynomial variations in running time, Cobham-Edmonds thesis states that 'the time complexities in any two reasonable and general models of computation are polynomially related' (Goldreich 2008, Chapter 1.2).

This forms the basis for the complexity class P, which is the set of decision problems solvable by a deterministic Turing machine within polynomial time.

For the complexity classes defined in this way, it is desirable to prove that relaxing the requirements on (say) computation time indeed defines a bigger set of problems.

Having deduced such proper set inclusions, we can proceed to make quantitative statements about how much more additional time or space is needed in order to increase the number of problems that can be solved.

For instance, the time hierarchy theorem tells us that P is strictly contained in EXPTIME, and the space hierarchy theorem tells us that L is strictly contained in PSPACE.

There are many different types of reductions, based on the method of reduction, such as Cook reductions, Karp reductions and Levin reductions, and the bound on the complexity of reductions, such as polynomial-time reductions or log-space reductions.

(Since many problems could be equally hard, one might say that X is one of the hardest problems in C.) Thus the class of NP-complete problems contains the most difficult problems in NP, in the sense that they are the ones most likely not to be in P.

The complexity class NP, on the other hand, contains many problems that people would like to solve efficiently, but for which no efficient algorithm is known, such as the Boolean satisfiability problem, the Hamiltonian path problem and the vertex cover problem.

These include various types of integer programming problems in operations research, many problems in logistics, protein structure prediction in biology,[6]

given large but finite resources, especially time), but for which in practice any solution takes too many resources to be useful, is known as an intractable problem.[15]

However, this identification is inexact: a polynomial-time solution with large exponent or large constant term grows quickly, and may be impractical for practical size problems;

conversely, an exponential-time solution that grows slowly may be practical on realistic input, or a solution that takes a long time in the worst case may take a short time in most cases or the average case, and thus still be practical.

Similarly, algorithms can solve the NP-complete knapsack problem over a wide range of sizes in less than quadratic time and SAT solvers routinely handle large instances of the NP-complete Boolean satisfiability problem.

For small n, say 100, and assuming for the sake of example that the computer does 1012 operations each second, the program would run for about 4 × 1010 years, which is the same order of magnitude as the age of the universe.

Even with a much faster computer, the program would only be useful for very small instances and in that sense the intractability of a problem is somewhat independent of technological progress.

Before the actual research explicitly devoted to the complexity of algorithmic problems started off, numerous foundations were laid out by various researchers.

However, [my] initial interest [in automata theory] was increasingly set aside in favor of computational complexity, an exciting fusion of combinatorial methods, inherited from switching theory, with the conceptual arsenal of the theory of algorithms.

These ideas had occurred to me earlier in 1955 when I coined the term 'signalizing function', which is nowadays commonly known as 'complexity measure'.[22]

In 1967, Manuel Blum formulated a set of axioms (now known as Blum axioms) specifying desirable properties of complexity measures on the set of computable functions and proved an important result, the so-called speed-up theorem.

In 1972, Richard Karp took this idea a leap forward with his landmark paper, 'Reducibility Among Combinatorial Problems', in which he showed that 21 diverse combinatorial and graph theoretical problems, each infamous for its computational intractability, are NP-complete.[23]

At that time, computational complexity theory was at its height, and it was widely believed that if a problem turned out to be NP-complete, then there was little chance of being able to work with the problem in a practical situation.

However, it became increasingly clear that this is not always the case, and some authors claimed that general asymptotic results are often unimportant for typical problems arising in practice.[24]

2. Algorithms

Every computer device you have ever used, from your school computers to your calculator, has been using algorithms to tell it how to do whatever it was doing.

However the amount of data computers use is often so large that it doesn't matter how fast the computer is, it will take it far too long to examine every single piece of data (companies like Google, Facebook and Twitter routinely process billions of things per day, and in some cases, per minute!) This is where algorithms come in.

If a computer is given a better algorithm to process the data then it doesn't matter how much information it has to look through, it will still be able to do it in a reasonable amount of time.

are each different ways of describing how to do something, but at different levels of precision: Often you can get away with describing a process just using some sort of informal instructions using natural language;

there's no way that a computer could follow those instructions exactly, but a human could probably get the general idea of what you mean if they know what you're trying to achieve.

This sort of description is only useful for quickly giving another human the general idea of what you mean, and even then there's a risk that they won't properly understand it.

the table could be very big (perhaps we're tracking millions of games and serving up the high score many times each second), that might already be enough to tell us that we need a better algorithm to track high scores regardless of which language it's going to be programmed in; or

the main point is that you could give it to a computer that runs Python, and it would follow the instructions exactly): But here's another program that implements exactly the same algorithm, this time in the Scratch language.

In this chapter we will talk about the cost of an algorithm as either the time it takes a program (which performs the algorithm) to complete, or the number of steps that the algorithm makes before it finishes.

For example, one way of expressing the cost of the high score algorithm above would be to observe that for a table of 10 values, it does about 10 sets of operations to find the best score, whereas

the case of the high scores, if you're running a game that suddenly becomes popular, you want to know in advance that the high score algorithm will be fast enough if you get more scores to check.

The amount of time a program which performs the algorithm takes to complete may seem like the simplest cost we could look at, but this can actually be affected by a lot of different things, like the speed of the computer being used, or the programming language the program has been written in.

This means that if the time the program takes to complete is used to measure the cost of an algorithm it is important to use the same program and the same computer (or another computer with the same speed) for testing the algorithm with different numbers of inputs.

The number of operations (such as comparisons of data items) that an algorithm makes however will not change depending on the speed of a computer, or the programming language the program using the algorithm is written in.


those formalizations included the Gödel–Herbrand–Kleene recursive functions of 1930, 1934 and 1935, Alonzo Church's lambda calculus of 1936, Emil Post's Formulation 1 of 1936, and Alan Turing's Turing machines of 1936–7 and 1939.

No human being can write fast enough, or long enough, or small enough† ( †'smaller and smaller without limit'd be trying to write on molecules, on atoms, on electrons') to list all members of an enumerably infinite set by writing out their names, one after another, in some notation.

But humans can do something equally useful, in the case of certain enumerably infinite sets: They can give explicit instructions for determining the nth member of the set, for arbitrary finite n.

Thus, Boolos and Jeffrey are saying that an algorithm implies instructions for a process that 'creates' output integers from an arbitrary 'input' integer or integers that, in theory, can be arbitrarily large.

Thus an algorithm can be an algebraic equation such as y = m + n – two arbitrary 'input variables' m and n that produce an output y.

From such uncertainties, that characterize ongoing work, stems the unavailability of a definition of algorithm that suits both concrete (in some sense) and abstract usage of the term.

Many computer programs contain algorithms that detail the specific instructions a computer should perform (in a specific order) to carry out a specified task, such as calculating employees' paychecks or printing students' report cards.

Typically, when an algorithm is associated with processing information, data can be read from an input source, written to an output device and stored for further processing.

Algorithms can be expressed in many kinds of notation, including natural languages, pseudocode, flowcharts, drakon-charts, programming languages or control tables (processed by interpreters).

Pseudocode, flowcharts, drakon-charts and control tables are structured ways to express algorithms that avoid many of the ambiguities common in natural language statements.

There is a wide variety of representations possible and one can express a given Turing machine program as a sequence of machine tables (see more at finite-state machine, state transition table and control table), as flowcharts and drakon-charts (see more at state diagram), or as a form of rudimentary machine code or assembly code called 'sets of quadruples' (see more at Turing machine).

However, algorithms are also implemented by other means, such as in a biological neural network (for example, the human brain implementing arithmetic or an insect looking for food), in an electrical circuit, or in a mechanical device.

In computer systems, an algorithm is basically an instance of logic written in software by software developers, to be effective for the intended 'target' computer(s) to produce output from given (perhaps null) input.

An optimal algorithm, even running in old hardware, would produce faster results than a non-optimal (higher time complexity) algorithm for the same purpose, running in more efficient hardware;

Minsky's machine proceeds sequentially through its five (or six, depending on how one counts) instructions, unless either a conditional IF–THEN GOTO or an unconditional GOTO changes program flow out of sequence.

For example, the subprogram in Euclid's algorithm to compute the remainder would execute much faster if the programmer had a 'modulus' instruction available rather than just subtraction (or worse: just Minsky's 'decrement').

Structured programming, canonical structures: Per the Church–Turing thesis, any algorithm can be computed by a model known to be Turing complete, and per Minsky's demonstrations, Turing completeness requires only four instruction types—conditional GOTO, unconditional GOTO, assignment, HALT.

To 'measure' is to place a shorter measuring length s successively (q times) along longer length l until the remaining portion r is less than the shorter length s.[52]

In modern words, remainder r = l − q×s, q being the quotient, or remainder r is the 'modulus', the integer-fractional part left over after the division.[53]

The following algorithm is framed as Knuth's four-step version of Euclid's and Nicomachus', but, rather than using division to find the remainder, it uses successive subtractions of the shorter length s from the remaining length r until r is less than s.

E2: [Is the remainder zero?]: EITHER (i) the last measure was exact, the remainder in R is zero, and the program can halt, OR (ii) the algorithm must continue: the last measure left a remainder in R less than measuring number in S.

This works because, when at last the minuend M is less than or equal to the subtrahend S ( Difference = Minuend − Subtrahend), the minuend can become s (the new measuring length) and the subtrahend can become the new r (the length to be measured);

the domain of the function computed by the algorithm/program, is to include only positive integers including zero, then the failures at zero indicate that the algorithm (and the program that instantiates it) is a partial function rather than a total function.

Proof of program correctness by use of mathematical induction: Knuth demonstrates the application of mathematical induction to an 'extended' version of Euclid's algorithm, and he proposes 'a general method applicable to proving the validity of any algorithm'.[56]

Can the algorithms be improved?: Once the programmer judges a program 'fit' and 'effective'—that is, it computes the function intended by its author—then the question becomes, can it be improved?

For example, a binary search algorithm (with cost O(log n) ) outperforms a sequential search (cost O(n) ) when used for table lookups on sorted lists or arrays.

However, ultimately, most algorithms are usually implemented on particular hardware / software platforms and their algorithmic efficiency is eventually put to the test using real code.

For the solution of a 'one off' problem, the efficiency of a particular algorithm may not have significant consequences (unless n is extremely large) but for algorithms designed for fast interactive, commercial or long life scientific usage it may be critical.

To illustrate the potential improvements possible even in well established algorithms, a recent significant innovation, relating to FFT algorithms (used heavily in the field of image processing), can decrease processing time up to 1,000 times for applications like medical imaging.[61]

Some example classes are search algorithms, sorting algorithms, merge algorithms, numerical algorithms, graph algorithms, string algorithms, computational geometric algorithms, combinatorial algorithms, medical algorithms, machine learning, cryptography, data compression algorithms and parsing techniques.

For example, dynamic programming was invented for optimization of resource consumption in industry, but is now used in solving a broad range of problems in many fields.

Owing to this, it was found to be more suitable to classify the problems themselves instead of the algorithms into equivalence classes based on the complexity of the best possible algorithms for them.

In the United States, a claim consisting solely of simple manipulations of abstract concepts, numbers, or signals does not constitute 'processes' (USPTO 2006), and hence algorithms are not patentable (as in Gottschalk v.

The patenting of software is highly controversial, and there are highly criticized patents involving algorithms, especially data compression algorithms, such as Unisys' LZW patent.

300 BC).[8]:Ch 9.1 Babylonian clay tablets describe and employ algorithmic procedures to compute the time and place of significant astronomical events.[70]

Tally-marks: To keep track of their flocks, their sacks of grain and their money the ancients used tallying: accumulating stones or marks scratched on sticks, or making discrete symbols in clay.

The work of the ancient Greek geometers (Euclidean algorithm), the Indian mathematician Brahmagupta, and the Persian mathematician Al-Khwarizmi (from whose name the terms 'algorism' and 'algorithm' are derived), and Western European mathematicians culminated in Leibniz's notion of the calculus ratiocinator (ca 1680):

good century and a half ahead of his time, Leibniz proposed an algebra of logic, an algebra that would specify the rules for manipulating logical concepts in the manner that ordinary algebra specifies the rules for manipulating numbers.[71]

led immediately to 'mechanical automata' beginning in the 13th century and finally to 'computational machines'—the difference engine and analytical engines of Charles Babbage and Countess Ada Lovelace, mid-19th century.[74]

Lovelace is credited with the first creation of an algorithm intended for processing on a computer – Babbage's analytical engine, the first device considered a real Turing-complete computer instead of just a calculator – and is sometimes called 'history's first programmer' as a result, though a full implementation of Babbage's second device would not be realized until decades after her lifetime.

Logical machines 1870—Stanley Jevons' 'logical abacus' and 'logical machine': The technical problem was to reduce Boolean equations when presented in a form similar to what are now known as Karnaugh maps.

Jevons (1880) describes first a simple 'abacus' of 'slips of wood furnished with pins, contrived so that any part or class of the [logical] combinations can be picked out mechanically .

More recently however I have reduced the system to a completely mechanical form, and have thus embodied the whole of the indirect process of inference in what may be called a Logical Machine' His machine came equipped with 'certain moveable wooden rods' and 'at the foot are 21 keys like those of a piano [etc] .

Another logician John Venn, however, in his 1881 Symbolic Logic, turned a jaundiced eye to this effort: 'I have no high estimate myself of the interest or importance of what are sometimes called logical machines ...

Jacquard loom, Hollerith punch cards, telegraphy and telephony—the electromechanical relay: Bell and Newell (1971) indicate that the Jacquard loom (1801), precursor to Hollerith cards (punch cards, 1887), and 'telephone switching technologies' were the roots of a tree leading to the development of the first computers.[78]

By the mid-19th century the telegraph, the precursor of the telephone, was in use throughout the world, its discrete and distinguishable encoding of letters as 'dots and dashes' a common sound.

Symbols and rules: In rapid succession the mathematics of George Boole (1847, 1854), Gottlob Frege (1879), and Giuseppe Peano (1888–1889) reduced arithmetic to a sequence of symbols manipulated by rules.

in which we see a ' 'formula language', that is a lingua characterica, a language written with special symbols, 'for pure thought', that is, free from rhetorical embellishments ...

The resultant considerations led to Kurt Gödel's paper (1931)—he specifically cites the paradox of the liar—that completely reduces rules of recursion to numbers.

Effective calculability: In an effort to solve the Entscheidungsproblem defined precisely by Hilbert in 1928, mathematicians first set about to define what was meant by an 'effective method' or 'effective calculation' or 'effective calculability' (i.e., a calculation that would succeed).

that the Entscheidungsproblem was unsolvable, Emil Post's definition of effective calculability as a worker mindlessly following a list of instructions to move left or right through a sequence of rooms and while there either mark or erase a paper or observe the paper and make a yes-no decision about the next instruction.[87]

Turing—his model of computation is now called a Turing machine—begins, as did Post, with an analysis of a human computer that he whittles down to a simple set of basic motions and 'states of mind'.

number of efforts have been directed toward further refinement of the definition of 'algorithm', and activity is on-going because of issues surrounding, in particular, foundations of mathematics (especially the Church–Turing thesis) and philosophy of mind (especially arguments about artificial intelligence).

If you're seeing this message, it means we're having trouble loading external resources on our website.

If you're behind a web filter, please make sure that the domains * and * are unblocked.

Intro to Algorithms: Crash Course Computer Science #13

Algorithms are the sets of steps necessary to complete computation - they are at the heart of what our devices actually do. And this isn't a new concept. Since the ...

Algorithm using Flowchart and Pseudo code Level 1 Flowchart

Algorithm using Flowchart and Pseudo code Level 1 Flowchart By: Yusuf Shakeel 0:05 Things we will learn ..

Programming Basics: Creating an algorithm/flowchart and then adding a counter.

via YouTube Capture.

Early Computing: Crash Course Computer Science #1

Hello, world! Welcome to Crash Course Computer Science! So today, we're going to take a look at computing's origins, because even though our digital ...

Algorithms lecture 2 -- Time complexity Analysis of iterative programs


Representing Numbers and Letters with Binary: Crash Course Computer Science #4

Please take our PBS Digital Studios Survey! Today, we're going to take a look at how computers use a stream of 1s ..

Time complexity analysis - How to calculate running time?

See complete series on time complexity here In this lesson, we will see how to ..

Concepts of Algorithm, Flow Chart & C Programming

Concepts of Algorithm, Flow Chart & C Programming by Prof. Wongmulin | Dept. of Computer Science Garden City College-Bangalore.

Maze Solving - Computerphile

Putting search algorithms into practice. Dr Mike Pound reveals he likes nothing more in his spare time, than sitting in front of the TV coding. EXTRA BITS: ...