The documentation of the Hadoop Job API gives the following as an example:

From https://hadoop.apache.org/docs/r3.3.5/api/org/apache/hadoop/mapreduce/Job.html

Here is an example on how to submit a job:

     // Create a new Job
     Job job = Job.getInstance();
     job.setJarByClass(MyJob.class);
     
     // Specify various job-specific parameters     
     job.setJobName("myjob");
     
     job.setInputPath(new Path("in"));
     job.setOutputPath(new Path("out"));
     
     job.setMapperClass(MyJob.MyMapper.class);
     job.setReducerClass(MyJob.MyReducer.class);

     // Submit the job, then poll for progress until the job is complete
     job.waitForCompletion(true);

It seems natural to attach the input and output paths to the job.

However, the setInputPath() and setOutputPath() methods do not exist in the API (see the rest of the documentation page).

And this example actually does not compile.

The correct way to set the paths is to replace those two lines with:

FileInputFormat.addInputPath(job, new Path("in"));
FileOutputFormat.setOutputPath(job, new Path("out"));
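
For completeness, here is a minimal sketch of what the corrected snippet looks like in full, following the documentation's example (MyJob, MyMapper and MyReducer are the placeholder classes from that example, not real classes):

     import org.apache.hadoop.fs.Path;
     import org.apache.hadoop.mapreduce.Job;
     import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
     import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

     // Create a new Job
     Job job = Job.getInstance();
     job.setJarByClass(MyJob.class);
     job.setJobName("myjob");

     // Input and output paths are set via the format classes, not via the Job itself
     FileInputFormat.addInputPath(job, new Path("in"));
     FileOutputFormat.setOutputPath(job, new Path("out"));

     job.setMapperClass(MyJob.MyMapper.class);
     job.setReducerClass(MyJob.MyReducer.class);

     // Submit the job, then poll for progress until the job is complete
     job.waitForCompletion(true);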

The use of these methods (documented here [1]) seems rather inelegant to me, as I can't see why this cannot be a method of Job (which is what the example in the documentation seems to intend).

My question is: can an expert

  • explain the rationale, and
  • confirm that the documentation is wrong?

[1] https://hadoop.apache.org/docs/r3.3.5/api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html


1 Answer

My "non expert" opinion - if the code compiles, it is not wrong. If it does not, then it's worthy of a JIRA ticket to the Apache Hadoop project (maybe one already exists).

Just because the Javadoc is missing (somehow) doesn't necessarily mean it's wrong. Open-source documentation is hard, and example code embedded in comments is not (regularly) checked for compilation errors...

In any case, I don't know a single person who explicitly writes MapReduce API code anymore without a higher-level abstraction, so what's the use case you're trying to solve?
