Kadane’s algorithm – finding maximum sum in contigous sub-array
This article is originally published at https://tomaztsql.wordpress.com
Great algorithm for analyzing timeseries data or array of numbers.
An algorithm for finding the contiguous subarray within a one-dimensional array of numbers that has the largest sum. It is called Kadane’s algorithm.
How does the algorithm work? It looks for a global maximum of positive-sum on any sub-array, regardless of its starting position or its length. Numbers must be next or together in a sequence (without any number left out).
With formula as: local_maximum at index i is the maximum of A[i] and the sum of A[i] and local_maximum at index i-1. So, you calculate the sum of every possible subarray ending with the element array[n-1]. Then, we would calculate the sum of every possible subarray ending with array[n-2], array[n-3] and so on up to array[0].
With a simple function we can iterate through the array (vector) of values:
#sample vector
v <- c(-3,-8, 1, -2, 1, 5, -3, -4, 3, 10, -2, 4, 1)
kadane <- function(v){
max_so_far = -999999999999999 #some obnoxiously big number
max_ending_here = 0
for (i in 1:length(v)) {
max_ending_here = max_ending_here + v[i]
if (max_so_far < max_ending_here ){
max_so_far = max_ending_here
}
if (max_ending_here < 0) {
max_ending_here = 0
}
}
return (max_so_far)
}
Run the function with:
# run the function
kadane(v)
The function can also return the starting and ending positions of the array.
kadane_with_position <- function(v){
max_so_far = -999999999999999 #some obnoxiously big number
max_ending_here = 0
subarray_start = 0
subarray_end = 0
int_s = 0
for (i in 1:length(v)) {
max_ending_here = max_ending_here + v[i]
if (max_so_far < max_ending_here ){
max_so_far = max_ending_here
subarray_start = int_s
subarray_end = i
}
if (max_ending_here < 0) {
max_ending_here = 0
int_s = i + 1
}
}
cat("Sum is: ", max_so_far, " with starting position: ", subarray_start, " and ending: ", subarray_end)
}
# run the function
kadane_with_position(v)
Kadane’s algorithm has also an interesting time complexity function. With O(n) for an array, it can explode exponentially when using 2D matrices: O(n^3).
R library Adagio offers this function with ability to calculate over matrices.
As always, code is available at Github in the same Useless_R_function repository. Check Github for future updates.
Happy R-coding and stay healthy!“
Thanks for visiting r-craft.org
This article is originally published at https://tomaztsql.wordpress.com
Please visit source website for post related comments.