Minni: Lightweight MapReduce Library

Issue 26: Wordcount results not being written to output file

Reported by Hrishikesh Amur, Jul 1, 2010

Athula: Okay, so basically it didn't get a ready status but got 
done or dne, came out of the loop, and hence there is nothing 
in the aggregators. Because of that it doesn't write anything to 
the file.


Logs are attached:
./test &> ./test.log
./workdaemon &> workdaemon.log

Comment 1 by Wolfgang Richter, Jul 1, 2010

Labels: Priority:Critical Priority:Medium
Status: Accepted
Owner: abalacha

Comment 2 by Hrishikesh Amur, Jul 1, 2010

Attaching tarball of job0_partition0.map.

Comment 3 by Hrishikesh Amur, Jul 1, 2010

The last attachment was corrupted; this one should be fine. Also 
attaching the test and workdaemon log files from the next run.

Comment 4 by Hrishikesh Amur, Jul 1, 2010

Reposting the map local file after the bound fix.

Comment 5 by Wolfgang Richter, Jul 1, 2010

Bound fix in commit 098e61.

Comment 6 by Wolfgang Richter, Jul 1, 2010

The master was not sending the message that notifies the workdaemon 
that there are nodes with finished map data.  Fix in commit f8f76b9.  
Patch attached.

That fix uncovered a new issue, which could be in the lib code or 
the workdaemon code:

"Reducer: Going to do reduce on the file 
/home/wolf/Dropbox/CMU/Courses/OS_and_Distributed_Systems/wolf_repo/src/worker/gen-cpp/job3.reduce
I am here to do reduce on 
/home/wolf/Dropbox/CMU/Courses/OS_and_Distributed_Systems/wolf_repo/src/worker/gen-cpp/job3.reduce
Inside and now going to read the file 
Reading errorReducer: Done with reducing"

Comment 7 by Wolfgang Richter, Jul 2, 2010

New hot fix in commit 6ac2399, potentially fixing the previous 
issue.

This one will guarantee that a message about nodes with data arrives 
before the allmapsfinished message.

If Erik's code is correct, it will then have a chance to grab the 
data, instead of running reducers without a list of nodes to pull 
intermediate data from.  This time the list should *definitely* be 
populated, thanks to the new synchronous call.
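
To illustrate the ordering this hot fix relies on, here is a minimal sketch. Every name in it (WorkerState, on_nodes_with_data, on_all_maps_finished, notify_in_order) is hypothetical and is not taken from the Minni source or from commit 6ac2399; it only models the idea that the node list is delivered by a call that returns before the allmapsfinished message is sent.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical worker-side state; names are illustrative, not Minni's.
class WorkerState {
public:
    // Records nodes that hold finished map output.
    // Assumption: with the fix, this always arrives before allmapsfinished.
    void on_nodes_with_data(const std::vector<std::string>& nodes) {
        assert(!all_maps_finished_ && "node list must arrive first");
        sources_.insert(sources_.end(), nodes.begin(), nodes.end());
    }

    // Returns true if reducers have at least one node to pull from.
    bool on_all_maps_finished() {
        all_maps_finished_ = true;
        return !sources_.empty();
    }

    const std::vector<std::string>& sources() const { return sources_; }

private:
    std::vector<std::string> sources_;
    bool all_maps_finished_ = false;
};

// Hypothetical master-side sequence: the first call is synchronous, so it
// has completed before the allmapsfinished notification goes out.
bool notify_in_order(WorkerState& worker,
                     const std::vector<std::string>& nodes) {
    worker.on_nodes_with_data(nodes);
    return worker.on_all_maps_finished();
}
```

In this model, a worker that receives allmapsfinished without a prior node list reports that it has nothing to pull, which matches the empty-output symptom in the original report.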

This patch needs to be verified...

Comment 8 by Wolfgang Richter, Jul 2, 2010

The reducer is still hitting an error.  Partial log attached.

It seems it's getting a file name, but for some reason reads fail 
(or are being reported as failures) on that file?

After job completion, the file exists and is also attached to this 
issue.

Comment 9 by Wolfgang Richter, Jul 2, 2010

I think we'll need more debug info for this one.

Athula should probably add a bunch of debug statements telling us 
the length of the file, read bytes, etc. so we can try and narrow 
down the issue.
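
As a rough idea of the kind of debug output meant here, the sketch below reports the file length, the bytes actually read, and whether the stream ended on EOF or on an error. The ReadReport struct and the debug_read function are hypothetical names and are not part of the reducer code; only standard C stdio calls are used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical report type, for illustration only.
struct ReadReport {
    long file_length;        // -1 if the length could not be determined
    std::size_t bytes_read;  // total bytes returned by fread
    bool hit_eof;            // stream ended at end-of-file
    bool had_error;          // open failed or ferror() was set
};

// Reads the whole file in 1-byte elements and logs what happened to stderr.
ReadReport debug_read(const std::string& path) {
    assert(!path.empty());
    ReadReport report{-1, 0, false, false};

    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (f == nullptr) {
        std::perror(path.c_str());
        report.had_error = true;
        return report;
    }

    // Determine the file length, then go back to the start.
    if (std::fseek(f, 0, SEEK_END) == 0) {
        report.file_length = std::ftell(f);
    }
    std::rewind(f);

    char buf[4096];
    std::size_t n = 0;
    // Element size is 1, so the return value is a byte count here.
    while ((n = std::fread(buf, 1, sizeof buf, f)) > 0) {
        report.bytes_read += n;
    }
    report.hit_eof = std::feof(f) != 0;
    report.had_error = std::ferror(f) != 0;

    std::fprintf(stderr, "%s: length=%ld read=%zu eof=%d error=%d\n",
                 path.c_str(), report.file_length, report.bytes_read,
                 report.hit_eof ? 1 : 0, report.had_error ? 1 : 0);
    std::fclose(f);
    return report;
}
```

If length and read agree and error is 0, the read itself succeeded and the failure is being misreported somewhere downstream.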

Comment 10 by Wolfgang Richter, Jul 2, 2010

Commit 7e05f68 should provide a final fix for this issue.

Patch is attached.

Please verify.

The problem was that the return value of fread was being interpreted 
as a byte count, when it is really the number of complete "elements" 
read (the element size in bytes is whatever the caller passes as the 
size argument).
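
The sketch below shows how the same read can return 1 or N depending on the element size passed to fread. It is not the code from commit 7e05f68; the helper names (make_temp_file, elements_read_as_one_block, elements_read_as_bytes) are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Creates a temporary file holding n bytes (n > 0). Illustrative helper.
std::FILE* make_temp_file(std::size_t n) {
    std::FILE* f = std::tmpfile();
    assert(f != nullptr);
    std::vector<char> data(n, 'x');
    std::size_t written = std::fwrite(data.data(), 1, n, f);
    assert(written == n);
    std::fflush(f);
    return f;
}

// Requests the data as ONE element of n bytes: fread returns 1 on success,
// or 0 if fewer than n bytes are available.
std::size_t elements_read_as_one_block(std::FILE* f, std::size_t n) {
    std::vector<char> buf(n);
    std::rewind(f);
    return std::fread(buf.data(), n, 1, f);  // size = n, count = 1
}

// Requests the data as n elements of 1 byte each: fread returns the
// number of bytes actually read.
std::size_t elements_read_as_bytes(std::FILE* f, std::size_t n) {
    std::vector<char> buf(n);
    std::rewind(f);
    return std::fread(buf.data(), 1, n, f);  // size = 1, count = n
}
```

Code that compares the first form's return value against the expected byte count will report a read error even though the whole buffer was filled. Passing 1 as the element size, or multiplying the return value by the element size, gives a byte count.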

Comment 11 by Wolfgang Richter, Jul 2, 2010

Hrishi has verified the fix. This chain of fixes and discussion 
looks to be coming to an end, so I am closing the issue.
Status: Fixed

Created: 15 years 19 days ago by Hrishikesh Amur

Updated: 15 years 18 days ago

Status: Fixed

Owner: Athula Balachandran

Labels:
Priority:Critical
Type:Defect