Minni: Lightweight MapReduce Library

Issue 24: [lib] Implement Eviction Policy for In Memory Hashtable

Reported by Wolfgang Richter, Jul 1, 2010

As per design discussions, allow the in-memory hashtable to fill up 
to a configurable limit, which can be overridden through the 
properties of a job.

Flush to a file when it fills, merging if the file already exists.

Reducers merge all data together in the same way as a mapper. Once 
they have everything from all mappers (i.e., they get the 
all-finished signal), they write out the final results to the DFS.

Also, both mappers and reducers should perform a final merge when 
they finish processing all of their input data, even if their 
hashtable isn't full.
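
To make the requested behaviour concrete, here is a minimal sketch in Python. It is an illustration only and not Minni's code: the names SpillingTable, max_entries, spill_path and word_count are hypothetical, the values are assumed to be counts that merge by addition, and the spill file is assumed to be JSON. For brevity the flush step loads the whole spill file into memory; a real implementation would more likely stream a sorted merge so that the file never has to fit in memory.

```python
import json
import os
import tempfile


class SpillingTable:
    """In-memory key -> count table that spills to a file when it fills."""

    def __init__(self, spill_path, max_entries):
        self.spill_path = spill_path
        # Hypothetical stand-in for the limit passed via job properties.
        self.max_entries = max_entries
        self.table = {}

    def add(self, key, value):
        # Combine with any value already held in memory for this key.
        self.table[key] = self.table.get(key, 0) + value
        if len(self.table) >= self.max_entries:
            self.flush()

    def flush(self):
        """Merge the in-memory table into the spill file, then clear it."""
        if not self.table:
            return
        merged = {}
        if os.path.exists(self.spill_path):
            # The file already exists, so merge instead of overwriting.
            with open(self.spill_path) as f:
                merged = json.load(f)
        for key, value in self.table.items():
            merged[key] = merged.get(key, 0) + value
        with open(self.spill_path, "w") as f:
            json.dump(merged, f)
        self.table.clear()

    def finish(self):
        """Final merge, performed even if the table never filled."""
        self.flush()
        if not os.path.exists(self.spill_path):
            return {}
        with open(self.spill_path) as f:
            return json.load(f)


def word_count(words, max_entries=2):
    """Count words through a SpillingTable backed by a temporary file."""
    path = os.path.join(tempfile.mkdtemp(), "spill.json")
    table = SpillingTable(path, max_entries)
    for word in words:
        table.add(word, 1)
    return table.finish()
```

Under these assumptions a mapper and a reducer could share the same class: each calls add() for every record it sees and calls finish() once its input is exhausted (for a reducer, after the all-finished signal), which covers the final merge for tables that never reached the limit.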

Comment 1 by Wolfgang Richter, Jul 3, 2010

Initial skeleton code supporting this with an 
ExtendableHashtable is now in commit d9b620d, branch 
external_sort.

Created: 15 years 19 days ago by Wolfgang Richter

Updated: 15 years 17 days ago

Status: Accepted

Owner: Athula Balachandran

Labels:
OpSys:All
Type:Enhancement
Component:Logic
Other:Usability
Priority:Critical