Installing and Running Hadoop and Spark on Windows
We recently got a big new server at work to run Hadoop and Spark (H/S) on for a proof...
Hi Andrew,
I am getting this error when I try to execute start-yarn.cmd:
This file does not have an app associated with it for performing this action. Please install an app or, if one is already installed, create an association in the default apps settings page.
It seems like yarn is not a command known by windows.
Here is my environment variables:
thepracticaldev.s3.amazonaws.com/i...
Hi David,
It sounds like you're trying to run this program by double-clicking on it. You should run it from the cmd prompt instead. Let me know if that works for you.
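For reference, a typical invocation from the command prompt looks something like this (the install path here is illustrative; adjust it to wherever Hadoop lives on your machine):

```shell
:: Open cmd.exe and change into Hadoop's script directory
cd C:\BigData\hadoop-3.1.2\sbin

:: Start the HDFS daemons (NameNode + DataNode), then YARN
start-dfs.cmd
start-yarn.cmd
```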
No. I am using the cmd console.
For example, if I type just hadoop in the console, it shows me some options. But if I type yarn, it says:
'yarn' is not recognized as an internal or external command,
operable program or batch file.
I am attaching the images where it can be seen: thepracticaldev.s3.amazonaws.com/i...
These error messages are giving you hints about what's going wrong. It looks like your %PATH% is set up correctly and hadoop is on it, but you can't run the hadoop command by itself. That's what the error message is telling you. You need to include additional command-line arguments. Try running hadoop version and see if you get any output.

When I execute "hadoop version" I get this:
Hadoop 2.9.1
Subversion github.com/apache/hadoop.git -r e30710aea4e6e55e69372929106cf119af06fd0e
Compiled by root on 2018-04-16T09:33Z
Compiled with protoc 2.5.0
From source with checksum 7d6d2b655115c6cc336d662cc2b919bd
This command was run using /C:/BigData/hadoop-2.9.1/share/hadoop/common/hadoop-common-2.9.1.jar
But if I try to execute just "yarn" I get:
'yarn' is not recognized as an internal or external command,
operable program or batch file.
Right, so hadoop is working fine. yarn isn't a command that you run; it's just the resource negotiator that Hadoop uses behind the scenes to manage everything. If you successfully ran start-yarn.cmd and start-dfs.cmd, you're good to go! Try uploading a file to HDFS (the Hadoop Distributed File System) and checking that it's been uploaded.
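The upload-and-check step can be sketched like this (the file names here are illustrative, not from the original post):

```shell
:: Copy a local file into the HDFS root
hdfs dfs -put C:\tmp\example.txt /example.txt

:: List the HDFS root to confirm the file arrived
hdfs dfs -ls /
```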
Hi Andrew ,
It is me again. Now I am testing on my personal machine, but I am having another problem. On my local machine my user is "David Serrano". As you can see, it has a space in it. When I try to format the namenode with "hdfs namenode -format" I get this error:
Error: Could not find or load main class Serrano
Caused by: java.lang.ClassNotFoundException: Serrano
So I guess the problem is the space in my user name. What can I do in this case?
Thanks in advance!
Hadoop doesn't like spaces in paths. I think the only thing you can do is put Java, Hadoop, and Spark in locations where there are no spaces in the path. I usually use:
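As a sketch, a space-free layout might look like the following (directory names are illustrative; the point is simply to stay out of Program Files and the user profile):

```shell
:: Keep everything under short, space-free roots
set JAVA_HOME=C:\Java\jdk1.8.0_201
set HADOOP_HOME=C:\BigData\hadoop-3.1.2

:: Put the Java and Hadoop bin/sbin directories on the PATH for this session
set PATH=%PATH%;%JAVA_HOME%\bin;%HADOOP_HOME%\bin;%HADOOP_HOME%\sbin
```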
Hi,
Well, all the files are in paths without spaces. However, hadoop is executing something using my user "David Serrano", and that is generating the problem. I have not found the root cause of this.
Are there any spaces on your %PATH% at all?

Hi,
Here it is my path:
C:\Users\David Serrano>echo %path%
C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\Program Files\Microsoft MPI\Bin\;C:\Program Files (x86)\Intel\iCLS Client\;C:\Program Files\Intel\iCLS Client\;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\dotnet\;C:\Program Files\Microsoft SQL Server\130\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\DTS\Binn\;C:\Program Files (x86)\Microsoft SQL Server\140\Tools\Binn\ManagementStudio\;C:\WINDOWS\System32\OpenSSH\;C:\Program Files\Git\cmd;C:\Program Files\Microsoft SQL Server\Client SDK\ODBC\130\Tools\Binn\;C:\Program Files\Microsoft SQL Server\140\Tools\Binn\;C:\Program Files\Microsoft SQL Server\140\DTS\Binn\;C:\Program Files\Java\jdk-12.0.1\bin;C:\Program Files\MySQL\MySQL Shell 8.0\bin\;C:\Progra~1\Java\jdk-12.0.1;C:\BigData\hadoop-3.1.2;C:\BigData\hadoop-3.1.2\bin;C:\BigData\hadoop-3.1.2\sbin;
As you can see, there are a lot of spaces. However, in the configuration of the variables I am using --> C:\Progra~1 ... in order to avoid problems with spaces. But the problem is with my user "David Serrano". The error says:
Error: Could not find or load main class Serrano
Caused by: java.lang.ClassNotFoundException: Serrano
As you can see, there is no "Serrano" word in the PATH, so my conclusion is that the problem is my user name. But I don't know how to avoid this.
Maybe it's doing something with your working directory path? Try cd-ing to C:\ first, then running Hadoop. I'm really not sure, though.

Hi Andrew,
when I run start-dfs.cmd and start-yarn.cmd command it gives me an error
C:\Java\jdk1.8.0_201\bin\java -Xmx32m -classpath "C:\Hadoop\hadoop-3.1.2\etc\hadoop;C:\Hadoop\hadoop-3.1.2\share\hadoop\common;C:\Hadoop\hadoop-3.1.2\share\hadoop\common\lib*;C:\Hadoop\hadoop-3.1.2\share\hadoop\common*" org.apache.hadoop.util.PlatformName' is not recognized as an internal or external command,
operable program or batch file.
The system cannot find the file C:\Windows\system32\cmd.exe\bin.
The system cannot find the file C:\Windows\system32\cmd.exe\bin.
Please help me
Hi پنوں,
It looks like your system variables are mis-configured. The path C:\Windows\system32\cmd.exe\bin doesn't make any sense, as cmd.exe is an executable, not a directory. Double-check that you have the environment variables set correctly and let me know if you continue to have issues.

There was a problem with the environment variables. I was using C:\Windows\system32\cmd.exe\bin, which was prompting an error, but when I changed the system variable to C:\Windows\system32\cmd.exe it ran fine.
Thank you BOSS for your help Stay blessed.
Happy to help!
Thanks Andrew for this tutorial, it was very helpful.
How does one address this encryption error:
INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false

I get it after I "hadoop put" a file.
Hi Chinanu. I haven't encountered an error like this before, so unfortunately your guess is as good as mine. This site seems to suggest that it might be an issue with missing jar files? I'm not sure.

Thank you so much, after 4 tutorials and 3 days of trying it finally worked! Yay!!!
For those who might have the same problem as I did: When I used start-dfs.cmd and start-yarn.cmd it said the command couldn't be found. After a quick internet search I figured out that I needed to go to the sbin directory because it's in there and start it from there. Worked fine then.
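For anyone hitting the same thing, the fix amounts to either running the scripts from sbin directly or putting that directory on the PATH (the install path here is illustrative):

```shell
:: Option 1: run the scripts from the sbin directory
cd C:\BigData\hadoop-3.1.2\sbin
start-dfs.cmd
start-yarn.cmd

:: Option 2: add sbin to the PATH for this session, then run from anywhere
set PATH=%PATH%;C:\BigData\hadoop-3.1.2\sbin
```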
Glad it worked! I actually went back to follow this guide again recently and skipped over the part where I say to add \sbin to the PATH, too. No worries!

Thanks for putting this together and sharing knowledge. I tried to get Hadoop up and running on my Windows machine last year, and it was painful! Anywho, it encouraged me to put together a blog just like you - exitcondition.com/install-hadoop-w...
Keep Exploring!
Just signed to thank you for this tutorial. Well explained and very clear. Also, thanks for the link to the patch for bin files. I was only able to work with older versions of hadoop and almost tempted to try to build the bins on my own. Cheers!
Hello Andrew,
Thank you so much for your tutorial. I just have some questions hope you can help:
After running start-dfs.cmd and start-yarn.cmd in cmd (the boot-HDFS step), I noticed that YARN is working fine, but the NameNode and DataNode started for a few seconds and then both stopped for some reason. Any idea what might cause this issue?
During the path-setup step, I couldn't run the command hdfs -version (I cd out to C:/User but I still get the same error: Error: Could not find or load main class Last Name), so I edited /etc/hadoop/hadoop-env.cmd and changed this line:
set HADOOP_IDENT_STRING=%USERNAME%
to
set HADOOP_IDENT_STRING=myuser
This allows me to do hdfs -version, but I don't know whether this change will affect anything; could you please clarify? Does this change make my NameNode and DataNode stop working?
But ... why? Just get Fedora and done ;)
Client-specified software that only runs on Windows Server :/
Well, that's sad. Have you thought about using something like an IIS container for those proprietary blobs?
I haven't, no... how would that work? Can you point me to any good resources?
See Docker Hub for more info, although I don't use it personally (I use & write FLOSS exclusively).
Hi Andrew,
Try the Syncfusion BigData Studio and Syncfusion Cluster Manager products. They have built-in Hadoop ecosystems for the Windows platform.
Much easier to install and configure Hadoop ecosystems in Windows.
Hi I am following a similar tutorial(joe0.com/2017/02/02/how-to-install...)
Earlier, when I ran the start-dfs.cmd command, the Hadoop cluster came up with no issues, but after I installed Flume, Sqoop, and Pig (by following the official websites), when I enter start-dfs.cmd I get the below.
Can you please help me solve it?
C:\hadoop\hadoop-3.1.2\sbin>start-dfs.cmd
The system cannot find the file hadoop.
The system cannot find the file hadoop.
Hi Andrew,
Thank you for this tutorial. It has really been helpful.
I have been able to run the start-yarn.cmd command successfully, but whenever I run start-dfs.cmd it gives me an error message:
"WARN datanode.Datanode: Problem connecting to server: localhost/127.0.0.1:9000"
Can you please tell me what to do to resolve this issue?
Thank you.
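One common cause of this warning (not confirmed for this particular case) is that the NameNode never actually started, or that fs.defaultFS in etc/hadoop/core-site.xml doesn't match the address the DataNode is dialing. Given the localhost:9000 in the message, the entry would normally look like this:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```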
Hi Andrew,
When I run the start-dfs.cmd and start-yarn.cmd commands, I get an error message.
My god, I've spent an insane amount of time on this for an assignment, and this was the only thing I've gotten to work. Thank you for putting this together.
Happy to help!
Hi Andrew,
Thanks a lot for this.
This is the only thing that worked for me.
Happy to help!
simply great, it worked like a charm. Thanks for this tutorial Andrew!!
Hi Andrew
This was so clear.
I had been getting problems installing Hadoop for a week, and this just made it a breeze.
Thank you indeed
Thanks Andrew. This post was really helpful.
hi Andrew,
I can't fix this problem:
issues.apache.org/jira/browse/YARN...
Hadoop version is 3.1.3.
When I start YARN, this folder gets created with insufficient permissions: /usercache. Running every script with insufficient permissions doesn't help.
Thanks a lot in advance!
Rodrigo
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Permissions incorrectly set for dir c:/Hadoop/hadoop-3.1.3/yarn/tmp-nm/usercache, should be rwxr-xr-x, actual value = rw-rw-rw-
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkLocalDir(ResourceLocalizationService.java:1665)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService.checkAndInitializeLocalDirs(ResourceLocalizationService.java:1633)
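If it helps anyone else: on Windows, POSIX-style directory permissions for YARN's local dirs are usually adjusted with the winutils.exe binary that ships in Hadoop's bin directory, something like the following (paths match the error above, but I can't confirm this resolves this exact report):

```shell
:: Set the usercache parent dir to the rwxr-xr-x (755) that YARN expects
C:\Hadoop\hadoop-3.1.3\bin\winutils.exe chmod -R 755 C:\Hadoop\hadoop-3.1.3\yarn\tmp-nm
```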
Thanks for the guide! Just noticed a small typo with one port number:
localhost:9087 instead of localhost:9870 (I should have looked at the image:)
Thanks for pointing that out! Typo is fixed :)
Hi Andrew, Thank you.
I have followed all the steps on Windows 7, but when I run hdfs -version I get an error: hdfs is not recognized.
Please help.
Can you give me the exact error message you get? I haven't tried this guide on Windows 7 -- I'm not sure it will work on that OS.
Hi,
I'm getting this error when I execute start-yarn.cmd.
thepracticaldev.s3.amazonaws.com/i...
Help me, please