Minni: Lightweight MapReduce Library

Issue 9: [dfs] read smaller blocks

Reported by Athula Balachandran, Jun 16, 2010

We need to add the ability to read smaller blocks to the DFS 
module. This would lower the memory footprint.

Subsequent modifications to the Mapper code are also needed.

Comment 1 by Wolfgang Richter, Jun 16, 2010

Actually, the function:

int64_t readChunkOffset(string path, uint64_t offset, char* buf, 
uint64_t length);

can read arbitrarily sized buffers based on the length parameter.
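
For illustration, a caller could stream a file through one small, fixed-size buffer by calling a function of this shape in a loop. This is only a sketch: forEachBlock and ChunkReader are hypothetical names, not part of Minni, and the return convention (bytes read, 0 at end of file, negative on error) is an assumption, since the issue does not state it.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Same shape as the readChunkOffset signature quoted above.
// ASSUMPTION: returns bytes read, 0 at end of file, negative on error.
using ChunkReader =
    std::function<int64_t(std::string path, uint64_t offset, char* buf, uint64_t length)>;

// Hypothetical helper: reads `path` in pieces of at most `blockSize` bytes and
// hands each piece to `consume`, so only one block is held in memory at a time.
// Returns the total number of bytes processed, or -1 if a read fails.
int64_t forEachBlock(const ChunkReader& readChunk,
                     const std::string& path,
                     uint64_t blockSize,
                     const std::function<void(const char*, uint64_t)>& consume) {
    assert(blockSize > 0);
    std::vector<char> buf(blockSize);
    uint64_t offset = 0;
    for (;;) {
        int64_t n = readChunk(path, offset, buf.data(), blockSize);
        if (n < 0) return -1;   // propagate the read error
        if (n == 0) break;      // end of file
        consume(buf.data(), static_cast<uint64_t>(n));
        offset += static_cast<uint64_t>(n);
    }
    return static_cast<int64_t>(offset);
}
```

With this shape, peak memory is bounded by blockSize rather than by the file size, which is the footprint reduction the issue asks for.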

There is an inefficiency in this function that needs fixing.

The file is opened and closed on each call to this function. We 
should probably leave it open so that reads can benefit from 
readahead, and close it explicitly later (using the DFS API).

Also, this DFS layer should be refactored into its own library 
common to the project, not part of the master's source subtree.
Status: Accepted

Comment 2 by Wolfgang Richter, Jun 17, 2010

Refactored DFS into its own separate convenience library.  Build 
structure is simplified now.

Next step is to modify reads to take advantage of readahead.

Comment 3 by Wolfgang Richter, Jun 17, 2010

Status: Started

Comment 4 by Wolfgang Richter, Jun 18, 2010

Refactored DFS layer to use a file cache so that we don't open and 
close files all the time.
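
As a rough sketch of the idea (not the actual Minni code), a file cache can keep one open handle per path and reuse it across reads. The FileCache class below is hypothetical, and it uses std::ifstream on local files as a stand-in for the real DFS client handles. Returning -1 on failure is also an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical cache of open file handles, keyed by path.
// ASSUMPTION: local files via std::ifstream stand in for DFS handles.
class FileCache {
public:
    // Reads up to `length` bytes at `offset`, reusing an open handle if one
    // exists. Returns bytes read (0 at end of file) or -1 on failure.
    int64_t readChunkOffset(const std::string& path, uint64_t offset,
                            char* buf, uint64_t length) {
        assert(buf != nullptr);
        std::ifstream* in = open(path);
        if (in == nullptr) return -1;
        in->clear();  // reset EOF/fail flags left by an earlier short read
        in->seekg(static_cast<std::streamoff>(offset));
        if (!*in) return -1;
        in->read(buf, static_cast<std::streamsize>(length));
        return static_cast<int64_t>(in->gcount());
    }

    // Explicitly closes a cached file; returns false if it was not open.
    bool close(const std::string& path) { return files_.erase(path) > 0; }

    std::size_t openCount() const { return files_.size(); }

private:
    // Returns the cached handle for `path`, opening it on first use.
    std::ifstream* open(const std::string& path) {
        auto it = files_.find(path);
        if (it != files_.end()) return it->second.get();
        std::unique_ptr<std::ifstream> f(new std::ifstream(path, std::ios::binary));
        if (!f->is_open()) return nullptr;
        std::ifstream* raw = f.get();
        files_[path] = std::move(f);
        return raw;
    }

    std::unordered_map<std::string, std::unique_ptr<std::ifstream>> files_;
};
```

Repeated reads of the same path then pay the open cost once, and the caller decides when to release the handle by calling close. A real cache would likely also cap the number of open handles, which this sketch does not do.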

This should close out this issue, except for changes needed in the 
user library.

Is anything more needed from the DFS layer?

Comment 5 by Wolfgang Richter, Nov 22, 2010

Closing as this is fixed as far as I know.
Status: Fixed

Created: 15 years 1 month ago by Athula Balachandran

Updated: 14 years 7 months ago

Status: Fixed

Owner: Wolfgang Richter

Followed by: 1 person

Labels:
Type:Enhancement
Priority:Medium