The documentation of the Hadoop Job API gives the following example:
From https://hadoop.apache.org/docs/r3.3.5/api/org/apache/hadoop/mapreduce/Job.html
Here is an example on how to submit a job:
// Create a new Job
Job job = Job.getInstance();
job.setJarByClass(MyJob.class);
// Specify various job-specific parameters
job.setJobName("myjob");
job.setInputPath(new Path("in"));
job.setOutputPath(new Path("out"));
job.setMapperClass(MyJob.MyMapper.class);
job.setReducerClass(MyJob.MyReducer.class);
// Submit the job, then poll for progress until the job is complete
job.waitForCompletion(true);
It seems natural to attach the input and output paths to the job. However, the setInputPath() and setOutputPath() methods do not exist in the API (see the rest of the documentation page), and this example does not actually compile. The correct way is to replace those two calls with:
FileInputFormat.addInputPath(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));
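To see what these static helpers actually do, here is a toy sketch in plain Java (these are not the real Hadoop classes): FileInputFormat.addInputPath does not store the path on the Job object itself, but writes it into the job's shared Configuration under a format-specific key. The key constant below mirrors Hadoop's "mapreduce.input.fileinputformat.inputdir", but the exact details are an assumption of this sketch, not a reading of Hadoop's source.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for org.apache.hadoop.conf.Configuration: a string key/value store.
class Configuration {
    private final Map<String, String> props = new HashMap<>();
    String get(String key) { return props.get(key); }
    void set(String key, String value) { props.put(key, value); }
}

// Toy stand-in for the Job: it only carries a Configuration, not paths.
class Job {
    private final Configuration conf = new Configuration();
    Configuration getConfiguration() { return conf; }
}

// Sketch of the static-helper pattern: the format class, not the Job,
// knows which configuration key holds the input directories.
class FileInputFormat {
    // Assumed key name, mirroring Hadoop's property.
    static final String INPUT_DIR = "mapreduce.input.fileinputformat.inputdir";

    // addInputPath appends, so a job can read several input directories.
    static void addInputPath(Job job, String path) {
        Configuration conf = job.getConfiguration();
        String existing = conf.get(INPUT_DIR);
        conf.set(INPUT_DIR, existing == null ? path : existing + "," + path);
    }
}

public class InputPathSketch {
    public static void main(String[] args) {
        Job job = new Job();
        FileInputFormat.addInputPath(job, "in1");
        FileInputFormat.addInputPath(job, "in2");
        // Prints the comma-joined list stored in the configuration.
        System.out.println(job.getConfiguration().get(FileInputFormat.INPUT_DIR));
    }
}
```

Under this reading, the Job stays agnostic of file paths entirely; only the InputFormat interprets them, which is one plausible reason the helpers live there.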
The use of these methods (doc here [1]) seems pretty inelegant to me, as I can't see why this cannot be a function of the Job (as the example in the doc intended to show).
My question is: can an expert
- give an explanation of the rationale, and
- confirm that the documentation is wrong?
[1] https://hadoop.apache.org/docs/r3.3.5/api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html
asked Nov 22, 2024 at 10:51 by user1551605; edited Nov 23, 2024 at 15:45 by OneCricketeer

1 Answer
My "non-expert" opinion: if the code compiles, it is not wrong. If it does not, it's worth a JIRA ticket to the Apache Hadoop project (maybe one already exists).
Just because Javadoc is missing (somehow) doesn't necessarily mean it's wrong. Open-source documentation is hard, and code comments are not (regularly) checked for compilation errors...
In any case, I don't know a single person who explicitly writes MapReduce API code anymore without a higher-level abstraction, so what's the use case you're trying to solve?