Learn coding by simple questions

K for What?
4 min read · Apr 23, 2021

Hi folks,

You must be very patient to have read my previous blog about momentum strategies, and I really appreciate that. In this blog, I just want to share some interesting coding problems that I have collected from books and from my friends. The problems are not necessarily difficult; they are just ones where I found something interesting while trying to solve them.

My favorite approach to solving any problem (and everything in life) is to simplify it first, either by divide-and-conquer or by solving special cases. In this blog, however, I go the other way: first I pose a simple problem, then I add more constraints/assumptions to make it harder. Now, here we go!

Problem 1: Find the mode of a list of numbers

See? It’s very simple. Even if you are just beginning to learn coding, I’m quite sure that you could come up with a solution to this problem.

I want to show you some ways to calculate the mode of a list. But first, here is a test function for my solutions:

import random
import time

def normalize(result):
    """Turn a single mode or a collection of modes into a sorted list"""
    if isinstance(result, (list, tuple, set)):
        return sorted(result)
    return [result]

def test(func):
    """
    Test the efficiency of the algorithm in terms of time and accuracy
    """
    start = time.perf_counter()
    series1 = [1, 2, 3, 4, 1]
    ans1 = [1]
    if normalize(func(series1)) == ans1:
        print('Pass case 1')
    else:
        print('Fail case 1')
    series2 = [11, 22, 1, 3, 4, 2, 2, 1]
    ans2 = [1, 2]  # there should be two modes in this series
    if normalize(func(series2)) == ans2:
        print('Pass case 2')
    else:
        print('Fail case 2')
    # Note: the elapsed time also includes building this large random list
    series3 = [random.randint(1, 1000) for i in range(10**7)]
    func(series3)
    end = time.perf_counter()
    print('Time elapsed:', end - start)

The mode is the element that appears most often in a series. There may be some built-in functions for calculating the mode that you already know, such as the one below:

"""
mode in library statistics
"""
import statistics
test(statistics.mode)

Surprisingly, the function fails test 2. The reason is that this built-in function returns only one mode. On the other hand, it runs quite fast.
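If you want to stay within the standard library, one option is statistics.multimode, which returns every value tied for the highest count. This is only a minimal sketch, and it assumes Python 3.8 or newer, where multimode is available and statistics.mode returns a single value instead of raising an error on ties:

```python
import statistics

series2 = [11, 22, 1, 3, 4, 2, 2, 1]

# statistics.mode returns a single value, even when several values tie
single = statistics.mode(series2)

# statistics.multimode returns a list with every value that has the top count
all_modes = statistics.multimode(series2)

print('mode:', single)
print('multimode:', all_modes)
```

Here the single value reported by statistics.mode is one of the values reported by statistics.multimode, which is why the first function cannot pass a test that expects two modes.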

"""
Use pandas.mode
"""
import pandas as pd
def use_mode_in_pandas(series):
    df = pd.DataFrame(series)
    # mode() keeps every value tied for the highest count; column 0 holds them
    return list(df.mode(axis=0)[0])

Yay, it passes test 2, but the code runs slower than the previous one.

"""
Use max function
"""
def use_max(series):
    return max(set(series), key=series.count)

Again a failure, and it’s worse: it fails test 2 and runs extremely slowly, because series.count rescans the whole list for every distinct value.
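To get a feeling for the slowdown without waiting on ten million numbers, here is a rough timing sketch on a much smaller list. The helper one_pass_mode and the sample size of 20,000 are my own assumptions for illustration; the exact timings will depend on your machine.

```python
import random
import time
from collections import Counter

def use_max(series):
    # series.count scans the whole list once per distinct value
    return max(set(series), key=series.count)

def one_pass_mode(series):
    # Hypothetical helper: count everything in a single pass, then pick a top value
    counts = Counter(series)
    return max(counts, key=counts.get)

random.seed(0)  # fixed seed so repeated runs use the same data
sample = [random.randint(1, 1000) for _ in range(20_000)]

for func in (use_max, one_pass_mode):
    start = time.perf_counter()
    result = func(sample)
    elapsed = time.perf_counter() - start
    print(func.__name__, 'mode:', result, 'seconds:', round(elapsed, 4))
```

Both functions return only one mode, so when several values tie they may not agree on which one to report. The point is the amount of work: with about 1,000 distinct values, use_max walks the list about 1,000 times, while the single-pass version walks it once.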

"""
Use Counter in library collections
"""
from collections import Counter
def use_counter(series):
    counter = Counter(series)
    max_count = max(counter.values())
    mode = [k for k, v in counter.items() if v == max_count]
    return mode

This may be the best solution for this problem. And this is my own code:

def find_mode(list_numbers):
    """
    If the list has no elements there is no mode; otherwise there may be one or many modes.
    Builds a dictionary where keys are the distinct elements of the input, and values are their frequencies in the series.
    - input:
        list_numbers: list of numbers in a series, type list
    - output: return the list of modes if there is at least one mode, type list. Otherwise print 'Empty list. There is no mode'
    """
    if len(list_numbers) == 0:
        print('Empty list. There is no mode')
        return None
    else:
        set_numbers = set(list_numbers)
        dict_numbers = {}
        # Keys are stored as strings, so this version only handles integers
        for x in set_numbers:
            dict_numbers[str(x)] = 0
        for x in list_numbers:
            dict_numbers[str(x)] += 1
        # max(...) is recomputed for every key in this comprehension
        result = [int(x) for x in dict_numbers if dict_numbers[x] == max(dict_numbers.values())]
        return result

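It gives the right answer, but it runs slower than the Counter version: every number is converted to a string and back, and max(dict_numbers.values()) is recalculated for every key in the final list comprehension.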
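For what it’s worth, the same dictionary idea can probably be sped up with two small changes: use the numbers themselves as keys, and compute the highest frequency only once. The sketch below is just one possible variant; the name find_mode_fast is hypothetical, and returning an empty list for empty input is my own assumption rather than the behavior of the function above.

```python
def find_mode_fast(list_numbers):
    """Hypothetical variant of find_mode: same dictionary idea, fewer repeated steps."""
    if len(list_numbers) == 0:
        # Assumption: an empty input simply has no modes
        return []
    dict_numbers = {}
    for x in list_numbers:
        # Use the number itself as the key, so no str/int round trip is needed
        dict_numbers[x] = dict_numbers.get(x, 0) + 1
    # Compute the highest frequency once instead of once per key
    max_count = max(dict_numbers.values())
    return [x for x, count in dict_numbers.items() if count == max_count]

print(find_mode_fast([1, 2, 3, 4, 1]))
print(find_mode_fast([11, 22, 1, 3, 4, 2, 2, 1]))
```

This is essentially what Counter does for you, so in practice the Counter solution is still the shorter one to write.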

A little bit sad! Anyway, I hope you have seen different ways to find the mode of a series. As you can see, not every built-in function gives the answer you expect, so check carefully before using one!

Thanks again, see you in the next blog!

K for What?

Quant Researcher, Data Scientist, Food Hater (so I eat them a lot).