Algorithms: Algorithm Specs, Complexity & Implementation

Introduction

An algorithm is a precisely defined procedure, implemented with a finite number of instructions, that executes a specific task. A few requirements:

  • often has one or more input values (but not necessarily)
  • produces at least one output value (the result)
  • completes in a finite amount of time

An algorithm is deterministic if it behaves the same way under the same conditions: it is predictable and always produces the same results. There are “correct” algorithms, which produce the correct/expected results for their input values, and “incorrect” algorithms, which produce unexpected results or simply never terminate (pseudoalgorithms).

Algorithm specification

Algorithms are usually expressed via:

  • natural language
  • graphic forms (flow diagrams)
  • programming language
  • pseudo language

I won’t go into too much detail, but for the pseudo language (used in the following articles) I’ll share some basic conventions:

  • while, repeat, for, if, case, loop (similar to Pascal)
  • indentation marks the body of a structure; blocks close with end + structure name (end_while, end_if)
  • single and multiple value assignment (a=b and a=b=c)
  • value exchange: a<->b
  • local and global variables
  • “..” – array range
  • pointers: for an object (containing fields x and y) with a pointer p to it, the fields are accessed as x(p) and y(p)
  • records, referenced as <Name>.<field>
  • procedure parameters are passed by value or by reference
  • recursion is allowed
  • informal language constructions are allowed (for each, find, etc.)
  • calls to previously defined procedures/functions are written in upper-case letters (ERROR, INPUT, etc.)
  • comments are excluded

Algorithm Analysis (Complexity)

The primary measures of an algorithm are usually efficiency and performance. Efficiency is measured by the computer resources consumed during execution (time and space). Space is less of a limiting factor nowadays, so execution time is the primary criterion.

Algorithm time depends on:

  • the number of instructions and their execution time in cycles (architecture)
  • the machine code generated by the compiler
  • the input data

The first two don’t reflect the essence of the algorithm, so determining algorithm performance relies on calculating the time complexity of the algorithm itself, depending on the size of the problem. As the size of the problem we take the one parameter that most affects execution time: the amount of input data (n). Algorithm complexity is then expressed as a function f(n).
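
To make f(n) concrete, here is a minimal sketch (my own illustration, not from the original text) that counts the basic operations of a trivial summing loop; the count grows linearly with n:

    # Count the dominant basic operations of a simple summing loop.
    def count_ops(n):
        ops = 0
        total = 0
        for i in range(n):
            total += i    # one addition per iteration
            ops += 1
        return ops

    for n in (10, 100, 1000):
        print(n, count_ops(n))    # the operation count is exactly n, so f(n) = n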

We can determine algorithm complexity using mathematics (combinatorics, algebraic transformations, etc.), but for some algorithms this leads to unsolvable mathematical models. The complexity of such models is determined via simulation or empirically (by measurements after implementation).
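
As a hedged sketch of the empirical approach, the snippet below times a stand-in function for growing n; a roughly tenfold increase in time per tenfold increase in n suggests linear behaviour:

    import timeit

    def f(n):
        # stand-in for an algorithm whose complexity we want to estimate
        return sum(i * i for i in range(n))

    for n in (1_000, 10_000, 100_000):
        t = timeit.timeit(lambda: f(n), number=100)
        print(f"n={n:>7}  time={t:.4f}s")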

Analysis usually provides a mathematical expression as a result, containing one or more terms that depend on the size n, plus constants. To determine algorithm performance it’s enough to know the order of magnitude, so the complexity gets approximated by the fastest-growing term.

For example, if we get f(n) = a_j n^j + … + a_1 n^1 + a_0 n^0, the complexity is of the order n^j; all terms of lower degree are ignored (their contribution becomes irrelevant for large n), as is the constant factor a_j beside the highest-degree term.
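
A quick numeric illustration with assumed coefficients: for f(n) = 3n^3 + 50n^2 + 7n + 1, the ratio f(n)/n^3 approaches the leading constant 3 as n grows, confirming that the lower-degree terms stop mattering:

    def f(n):
        return 3*n**3 + 50*n**2 + 7*n + 1

    for n in (10, 100, 1000, 10000):
        print(n, f(n) / n**3)    # 8.07, 3.50, 3.05, 3.005 -> tends to 3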

The usual approach for expressing algorithm complexity is O-notation. For f(n) we say it is O(g(n)) if there are positive constants c and n0 such that f(n) <= c*g(n) for every n >= n0. Equivalently, if lim f(n)/g(n) as n → infinity exists and is finite, then f(n) = O(g(n)). This is the approach for finding the upper bound of complexity, because f(n) is asymptotically bounded above by g(n).
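
A numeric sanity check of the definition, with example values I picked for illustration: f(n) = 3n^2 + 20n + 5 is O(n^2), witnessed by c = 4 and n0 = 21:

    def f(n):
        return 3*n**2 + 20*n + 5

    c, n0 = 4, 21
    assert all(f(n) <= c * n**2 for n in range(n0, 10_000))
    # at n = 20 the bound still fails (1605 > 1600), so n0 = 21 is tight here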

If we have two program segments P1 and P2 with complexities f1(n) and f2(n), and they execute in sequence, the total complexity is O(max(f1(n), f2(n))). If P2 is nested inside P1 (e.g. inside its loop), the complexity is O(f1(n) * f2(n)). O-notation is also transitive: if f(n) is of order O(g(n)) and g(n) is of order O(h(n)), then f(n) = O(h(n)).
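
A minimal sketch of the two composition rules (my example, not from the text):

    def sequential(n):
        # P1 followed by P2, each O(n): total O(max(n, n)) = O(n)
        for i in range(n): pass    # P1
        for j in range(n): pass    # P2

    def nested(n):
        # P2 nested inside P1: O(n * n) = O(n^2)
        for i in range(n):         # P1
            for j in range(n):     # P2
                pass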

O-notation is not an ideal method for determining algorithm complexity. It easily determines the upper bound, but due to the approximations it says nothing about real algorithm performance. It also discards constant factors, which can be vital in some cases.

For example, for n = 100, f(n) = 2n^3 is more efficient than f(n) = 300n^2 + 20n + 1, although the first has complexity O(n^3) and the second O(n^2).
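
Evaluating both functions shows where the asymptotics finally take over (a small sketch of mine):

    f_cubic = lambda n: 2*n**3
    f_quad = lambda n: 300*n**2 + 20*n + 1

    print(f_cubic(100), f_quad(100))    # 2000000 < 3002001: the cubic wins at n = 100
    crossover = next(n for n in range(1, 10_000) if f_cubic(n) > f_quad(n))
    print(crossover)                    # 151: from here on the O(n^2) variant wins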

Because of these problems, we sometimes need to determine the lower bound of complexity, using Ω-notation. For f(n) we say it is Ω(g(n)) if there are positive constants c and n0 such that f(n) >= c*g(n) for every n >= n0.

If a function f(n) fulfils both relations, f(n) = O(g(n)) and f(n) = Ω(g(n)), we say f(n) = Θ(g(n)). That represents both the upper and lower asymptotic performance bound: there are positive constants c1, c2 and n0 such that 0 <= c1*g(n) <= f(n) <= c2*g(n) for every n >= n0.

Aside from the amount of input data, the values of the inputs also determine algorithm performance. With that in mind we can have the best (minimum complexity), average and worst case scenarios. The average-case analysis captures the essence of the algorithm, but it’s the most difficult one, since we don’t know what the “average case” is (not all inputs/values are equally probable). Usually we rely on an analysis with random inputs.
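
Sequential search is a handy illustration of the three cases (a sketch of mine, not from the original):

    def seq_search(arr, key):
        """Return the index of key, or -1; the cost is the number of comparisons."""
        for i, x in enumerate(arr):
            if x == key:
                return i
        return -1

    # best case:    key at the front          -> 1 comparison,  O(1)
    # worst case:   key at the end or absent  -> n comparisons, O(n)
    # average case: uniformly random position -> about n/2,     still O(n)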

Worst-case analysis provides us with guaranteed performance (an upper bound):

  • O(1) – constant algorithms: the complexity doesn’t depend on the input data. The most desirable algorithms (and the rarest). E.g. adding an element to the beginning of a linked list. [Polynomial]
  • O(log n) – logarithmic algorithms: pretty popular due to slow and steady growth. If not explicitly stated, the base is 2 (log2 n). E.g. binary search of a sorted set (see the sketch after this list). [Polynomial]
  • O(n) – linear algorithms: loops executing n times, usually processing all input data. E.g. sequential search of an unordered array. [Polynomial]
  • O(n log n) – linear-logarithmic algorithms: binary decisions splitting the problem while processing all input data. E.g. the best sort algorithms (by key comparison) have this complexity. [Polynomial]
  • O(n^2) – quadratic algorithms: the usual form is two nested loops of size n. E.g. direct sort methods. [Polynomial]
  • O(n^k) – polynomial algorithms of degree k: the form of k nested loops of size n, with significant growth of complexity for big n. Acceptable for computer solving, especially for moderate values of k. [Polynomial]
  • O(k^n), k > 1 – exponential algorithms: not suitable for computer solving (for large problem sizes), due to the extreme growth of the exponential function. E.g. brute force.
* polynomial – the complexity can be expressed as a polynomial in n
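
As promised above, a minimal iterative binary search sketch (my illustration of the O(log n) class; the search range halves on every step):

    def binary_search(arr, key):
        lo, hi = 0, len(arr) - 1
        while lo <= hi:
            mid = (lo + hi) // 2      # halve the search range each step
            if arr[mid] == key:
                return mid
            elif arr[mid] < key:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1                     # key not present

    print(binary_search([1, 3, 5, 7, 9, 11], 7))    # -> 3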

Algorithm implementation

The implementation of an algorithm can affect its efficiency. It depends on the programming language and on the computer system running it. If an algorithm runs numerous times (frequent processing or loops), optimization is very important. On the other hand, for algorithms that are executed rarely, the usual focus is on low implementation complexity (easy understanding, coding and testing).

When selecting an algorithm we shouldn’t rely exclusively on its complexity, but also on the conditions in which the algorithm is going to execute. No algorithm is ideal in all conditions, so we should take that into consideration.

Conclusion

This was just a basic introduction to algorithm specs, analysis and implementation. Although nowadays we don’t lack memory, we should balance memory usage against time efficiency; they’re tightly connected. As mentioned, most algorithm complexity assessment is an approximation, but knowing how to approach, analyse and read those assessments will definitely come in handy at some point.