Below are the basic points to remember about MapReduce.
map: (K1,V1) → list(K2,V2)
reduce: (K2,list(V2)) → list(K3,V3)
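The signatures above can be illustrated with a minimal in-memory word count. This is a sketch in plain Python, not Hadoop code; the function names `map_fn`, `reduce_fn`, and `run_mapreduce` are illustrative, not part of any Hadoop API:

```python
from collections import defaultdict

def map_fn(key, value):
    # (K1, V1) = (line offset, line text) -> list of (word, 1) pairs
    return [(word, 1) for word in value.split()]

def reduce_fn(key, values):
    # (K2, list(V2)) = (word, [1, 1, ...]) -> list of (word, count)
    return [(key, sum(values))]

def run_mapreduce(records):
    # Shuffle: group every intermediate value by its key.
    groups = defaultdict(list)
    for k1, v1 in records:
        for k2, v2 in map_fn(k1, v1):
            groups[k2].append(v2)
    # Reducers see keys in sorted order.
    output = []
    for k2 in sorted(groups):
        output.extend(reduce_fn(k2, groups[k2]))
    return output

print(run_mapreduce([(0, "the cat"), (8, "the dog")]))
# [('cat', 1), ('dog', 1), ('the', 2)]
```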
- MapReduce works in two phases: a Map phase and a Reduce phase.
- The Map phase is mainly a data-preparation step: it restructures the input and filters out unwanted records.
- Each reducer receives an iterator that walks over all the values associated with a key.
- Writing output to a directory that already exists raises an exception.
- 1 reducer = 1 output file.
- The number of reducers can be set in the MapReduce program (via Job.setNumReduceTasks()).
- Input is divided into splits; ideally the split size equals the HDFS block size.
- If no reducer is defined, the map output is written directly to HDFS.
- Map tasks are scheduled on nodes that hold the input data where possible (data localisation).
- Map output is written to local disk, not HDFS; in the shuffle-and-sort phase it is sent to the reducers.
- A combiner works correctly only when the reduce function is associative and commutative.
- Combiners cut down the data shuffled across the network between the map and reduce phases.
- Hadoop Streaming lets you write the map and reduce functions in other languages, as long as they read from stdin and write to stdout.
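The combiner point can be demonstrated with a toy experiment: a combiner pre-reduces each map task's local output, so the final answer is only correct if applying the function per partition and then again across partitions equals applying it once over all values. Summing satisfies this (associative and commutative); averaging does not. A sketch, with made-up partition data:

```python
def combine_then_reduce(partitions, fn):
    # Apply fn as a combiner on each map task's local values for one key,
    # then apply fn again as the reducer over the combined results.
    return fn([fn(p) for p in partitions])

# Two map tasks each emitted some values for the same key.
partitions = [[1, 2], [3, 4, 5]]
all_values = [1, 2, 3, 4, 5]

# sum is associative and commutative, so a combiner is safe:
print(combine_then_reduce(partitions, sum))  # 15
print(sum(all_values))                       # 15 -- same answer

# mean is not, so a combiner would silently change the result:
mean = lambda xs: sum(xs) / len(xs)
print(combine_then_reduce(partitions, mean))  # 2.75
print(mean(all_values))                       # 3.0 -- different answer
```

This is why Hadoop treats the combiner as an optional optimisation it may run zero or more times: the job's correctness must not depend on it.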
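A Hadoop Streaming job runs any executable that reads lines from stdin and writes tab-separated key/value lines to stdout. A word-count mapper and reducer sketch in Python; the reducer relies on the framework having sorted the keys, so counts for the same word arrive contiguously (the command-line handling at the bottom is one illustrative way to package both stages in one script, not a Streaming requirement):

```python
import sys

def mapper(lines):
    # Emit "word\t1" for every word on every input line.
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(lines):
    # Input arrives sorted by key, so a running total per word suffices.
    current, total = None, 0
    for line in lines:
        word, count = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                yield f"{current}\t{total}"
            current, total = word, 0
        total += int(count)
    if current is not None:
        yield f"{current}\t{total}"

if __name__ == "__main__":
    stage = sys.argv[1] if len(sys.argv) > 1 else "map"
    stream = mapper if stage == "map" else reducer
    for out in stream(line.rstrip("\n") for line in sys.stdin):
        print(out)
```

The same pattern works in Ruby, Perl, or any language with stdin/stdout, which is exactly what makes Streaming language-agnostic.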