Sunday 1 December 2013

#13 Sorting Algorithm

This is the final post of the semester...

Sorting Algorithms:
Bubble, Selection, Insertion, Merge Sort, Quick Sort.

Bubble Sort
First I would like to discuss Bubble Sort, the old-fashioned one. Bubble sort is a very intuitive way for human beings to sort. However, this algorithm takes the longest for a computer to execute due to its nature. In big-O terms, the amount of time it takes to complete the whole sorting process is O(n^2), which is considered very long. However, it is not always the worst choice. For instance, when sorting a small list, bubble sort could be a reasonable alternative since the size of the list to be sorted is rather small. It is definitely not a good way to go when the list is huge, though.
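
A minimal sketch of bubble sort in Python (one possible version; the list length is stored once up front, a small optimization that comes up again in the lab 9 notes below):

```python
def bubble_sort(lst):
    """Sort lst in place by repeatedly swapping adjacent out-of-order pairs."""
    n = len(lst)  # store the length once instead of calling len() every pass
    for i in range(n - 1):
        # after pass i, the largest i+1 elements have "bubbled" to the end
        for j in range(n - 1 - i):
            if lst[j] > lst[j + 1]:
                lst[j], lst[j + 1] = lst[j + 1], lst[j]
    return lst
```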

Selection Sort:
Selection Sort works by repeatedly selecting the smallest element of the unsorted portion of the list and putting it in its right place. This algorithm improves on Bubble Sort since it gets rid of the unnecessary swaps: each pass we select the smallest element in the unsorted pile and perform at most one swap. At the beginning, the unsorted portion is of size n, which gradually shrinks to 0 as we sort through the list. So the number of comparisons is (n-1)+(n-2)+...+1, which is (n^2-n)/2 and has an asymptotic curve of n^2/2. So the amount of time to sort a large list would be roughly a factor of 0.5 of Bubble Sort's.
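
The idea above can be sketched like this (a generic version, not the exact lecture code):

```python
def selection_sort(lst):
    """Sort lst in place: each pass picks the smallest remaining element."""
    for i in range(len(lst) - 1):
        smallest = i
        for j in range(i + 1, len(lst)):
            if lst[j] < lst[smallest]:
                smallest = j
        # one swap per pass, placing the minimum at the front of the unsorted part
        lst[i], lst[smallest] = lst[smallest], lst[i]
    return lst
```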

Insertion Sort
Insertion Sort is done by repeatedly inserting the next item into the already sorted portion of the list. Again, this improves on Bubble Sort, and the amount of work to run through the sorting is about half of what Bubble Sort takes.
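
A sketch of the idea, shifting elements right to open a slot for the next item:

```python
def insertion_sort(lst):
    """Sort lst in place by growing a sorted prefix one item at a time."""
    for i in range(1, len(lst)):
        value = lst[i]
        j = i
        # shift larger elements of the sorted prefix one step right
        while j > 0 and lst[j - 1] > value:
            lst[j] = lst[j - 1]
            j -= 1
        lst[j] = value
    return lst
```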

Merge Sort:
Recursively divide the unsorted list, sort the halves, then merge them with the other sorted portions. This sorting algorithm naturally employs recursion, and the amount of time to complete the sorting does not depend on the initial order of the list: no matter how messy the list is, the big-O is O(n log n). This works better when the unsorted list is large and it would be very time consuming to employ the sorting algorithms above.
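
A minimal sketch of that divide-and-merge structure:

```python
def merge(left, right):
    """Combine two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # one of the two tails is empty, so this preserves sorted order
    return result + left[i:] + right[j:]

def merge_sort(lst):
    """Return a new sorted list; recursion bottoms out at length 0 or 1."""
    if len(lst) <= 1:
        return lst
    mid = len(lst) // 2
    return merge(merge_sort(lst[:mid]), merge_sort(lst[mid:]))
```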

Quick Sort:
Recursively select a 'pivot' and break the list into two sublists holding the values smaller and larger than the pivot. Again this is naturally done with recursion. It should also be noted that the algorithm performs worst on an already sorted (or 'reversed') list when the pivot is always taken from one end. The way to get around this scenario is to randomize the selection of the pivot so that we can reduce the expected number of computations.
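
A sketch with the randomized pivot mentioned above (a simple list-building version, not an in-place partition):

```python
import random

def quick_sort(lst):
    """Return a new sorted list using a randomly chosen pivot."""
    if len(lst) <= 1:
        return lst
    pivot = lst[random.randrange(len(lst))]  # random pivot avoids the sorted-input worst case
    smaller = [x for x in lst if x < pivot]
    equal = [x for x in lst if x == pivot]
    larger = [x for x in lst if x > pivot]
    return quick_sort(smaller) + equal + quick_sort(larger)
```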

Based on lab 9, there are many things that we can do to improve the performance of the algorithms discussed above.

Bubble Sort:
Store the length of the list in a variable once, so we don't need to call the built-in len() to evaluate the length of the unsorted list over and over.

Insertion Sort
Use the built-in del statement (plus insert) instead of shifting all the elements that follow the deleted element one by one.

Quick Sort:
Refer to above.

Last couple comments:
Timsort:
It makes use of ideas from the algorithms discussed above and chooses the best available strategy for the list at hand.

Count Sort:
Great for large lists, but it has limitations on the relationship between the list size and the range between the largest and smallest elements. If those conditions are not satisfied, the algorithm is not a good fit.
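
A sketch of counting sort under the assumption the post describes: the elements are non-negative integers and their range is small relative to the list size.

```python
def count_sort(lst):
    """Counting sort for non-negative ints; assumes max(lst) is reasonably small,
    since we allocate one counter per possible value."""
    if not lst:
        return []
    counts = [0] * (max(lst) + 1)
    for x in lst:
        counts[x] += 1
    result = []
    for value, count in enumerate(counts):
        result.extend([value] * count)  # emit each value as many times as it occurred
    return result
```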

#12 Topics covered in the last week

Map Reduce

As discussed in the lecture slides, MapReduce is somewhat like list comprehensions. With map and reduce we can transform an iterable and then combine it into a single reduced value. The basis of this concept is that many computations can be treated as parallel, which is what lets us reduce them; for inherently sequential computations, the concept may not apply. Just like in a list comprehension, where we generate a whole list using a single expression, all the elements generated can be treated as 'parallel' since the operation used to generate each of them is identical.
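
A small illustration of the map/reduce and list-comprehension parallel (squaring then summing, a generic example rather than one from lecture):

```python
from functools import reduce

nums = [1, 2, 3, 4]

# map: apply the same operation independently to each element
squares = list(map(lambda x: x * x, nums))

# reduce: combine the mapped values into a single result
total = reduce(lambda acc, x: acc + x, squares, 0)

# the same thing phrased as a list comprehension plus sum
total2 = sum([x * x for x in nums])
```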

Tracing

We were introduced to the following terminology:
Local, Non-Local, Global.
Fairly simple concepts that were explained through a comprehensive example in class. Basically, a global variable is one that lives in the global scope. A local variable only exists in the scope where it is assigned. If a 'local' variable sits in the global scope, then it is also a global variable. Finally, non-local refers to variables that sit one level above the current scope. For example, in the case of a helper function, a non-local variable refers to a variable in the enclosing function that defines the helper.

Finally, Memoization

I previously employed this concept in Assignment 1. Basically it allows us to omit redundant recursive calls, which reduces the amount of computation and improves efficiency. In Assignment 1, I built a table which memoizes the best cuts for cheeses, from 1 cheese up to n cheeses. By doing so I was able to find the best cuts very easily based on the previous computations and reuse that information during execution. This is a great tool that we can use to cut out unnecessary computations.
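
The same idea in miniature, using Fibonacci as a stand-in for the cheese problem: cache each answer in a dict so no subproblem is recomputed.

```python
def fib(n, memo=None):
    """Memoized Fibonacci: each fib(k) is computed at most once."""
    if memo is None:
        memo = {}
    if n in memo:
        return memo[n]          # reuse the previously computed answer
    if n < 2:
        memo[n] = n
    else:
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]
```

Without the memo this recursion is exponential; with it, fib(30) takes only 31 distinct computations.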

Sunday 24 November 2013

#11 Hiding attributes & equality & mapreduce

This week features 3 topics as the title of this post suggests.

Hiding attributes:
Sometimes we run into problems by allowing attributes to be modified from outside a class. In order to prevent such problems, we were introduced to a technique which "modifies" how the attributes of a class are accessed. By doing so we have better control over the attributes and prevent any undesired modifications to them.
As far as I am concerned, this kind of situation has yet to concern us, since we have only been dealing with small pieces of code. However, when dealing with a large piece of code, or in situations where multiple people collaborate to generate one large piece of code, having such tools would be highly preferred.
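
Assuming the technique in question is Python's property mechanism, a minimal sketch might look like this (the `Account` class is just an illustration):

```python
class Account:
    def __init__(self, balance):
        self._balance = balance  # underscore marks it as "private" by convention

    @property
    def balance(self):
        """Read access goes through this getter."""
        return self._balance

    @balance.setter
    def balance(self, value):
        """Write access is controlled: reject undesired modifications."""
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value
```

Code outside the class still writes `a.balance = 5`, but the setter now gets a chance to veto bad values.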

Equality:
By definition, equality means two identical objects. But in the world of programming, equality can vary according to the programmer's own definition. To define equality ourselves we can override the __eq__ method, which lets us specify exactly what counts as equal.
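
For example (a generic `Point` class, not one from the course), two points can be declared equal whenever their coordinates match:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # self-defined equality: same type and same coordinates
        return (isinstance(other, Point)
                and self.x == other.x
                and self.y == other.y)
```

Without overriding __eq__, two distinct Point objects would compare unequal even with identical coordinates, since the default compares identity.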

Mapreduce:
To be discussed in detail in Post #12

#10

This is a late post for last week. We had no class and wrote our second term test on Wednesday. It was supposed to be an easy test but somehow I misinterpreted the third question. Such a tragedy.

The material of the course is not hard, but I guess I was out of it on the day. Hope it will go better on the final exam...

Saturday 9 November 2013

#9 Sorting algorithms

A couple of sorting algorithms were introduced this week during lectures.

These are:
selection sort,
merge sort,
quick sort
count sort
built-in sort (timsort)

We implemented the first 4 during lectures and evaluated the performance of each to see how efficient they are. It turns out that the built-in sort outperforms most of the other sorting algorithms, except count sort when the pool of data to be sorted is rather significant.

So I became curious about the algorithm that timsort implements, and I was directed to the code Tim Peters wrote in C. Wow, I always thought nobody used C since it's a low-level programming language that is considered painful to write. So I had a brief discussion with my lab TA about the pros and cons of different programming languages. I was genuinely surprised when he told me a reason why iOS is in a way smoother than Android. A ground difference between them is the programming language used: iOS is based on C (Objective-C, really) while Android is based on Java. While C is harder to write, it allows programmers to take more control so that the program can run more smoothly.

I always thought knowing C was less advantageous compared to C++/Java...


#8 Algorithms

I have been wondering ever since I started learning C 4 years ago:
there are so many ways of doing things, so how do we decide which is the best out of all the alternatives?

Here we move on to a new topic, which relates to the analysis of algorithms:
Big-O.

Some models were discussed in class, and they all have distinct big-O behaviours depending on the algorithm.

Only a brief introduction this week so I will talk more about it next week.

#6 Assignment 2

The first comment I have on Assignment 2 is that it is significantly shorter than Assignment 1.

However, its difficulty is more or less the same as Assignment 1. The only difference is that the later assignment puts more emphasis on recursion.

The first challenge was parsing. At first I was trying to parse the string recursively: every time, we peel off the brackets and parse whatever is inside them. However, because of the existence of the *s, it becomes less effective to implement using recursion. So I figured I would be better off using what I already mastered from my experience in C --- loops and booleans.

Surprisingly simple to implement the old-fashioned way. Sometimes it's good to stick to the old ways, I guess. lol.

The second part was rather straightforward. The base cases were easy to generalize. The only thing I got stuck on was that, due to the nature of the matching, it could get too greedy when matching things, or sometimes too reluctant. These conditions can never co-exist without artificial control.

The only solution that came to my mind was to take into account all the possibilities, which is by nature very inefficient. So I had a brief conversation with Danny after class, and it turned out that he suggested the same way of solving the problem. The more efficient way will be covered in a second-year course...

So I decided to go for the inefficient solution, any() over a list comprehension. And problem solved.