Running Apache Spark on Windows

Running hadoop on windows is not trivial, however running Apache Spark on Windows proved not too difficult. I came across couple of blogs and stackoverflow discussion which made this possible. Putting down my notes below which are outcome of these reference material.

  1. Download ( )
  2. Download Hadoop distribution for Windows from
  3. Create hadoop_env.cmd  in {HADOOP_INSTALL_DIR}/conf directory.
    SET JAVA_HOME=C:\Progra~1\Java\jdk1.7.0_80
  4. In a new command window run hadoop-env.cmd followed by  {HADOOP_INSTALL_DIR}/bin/hadoop classpath
    The output of this command is used to initialize SPARK_DIST_CLASSPATH in spark-env.cmd (You may need to create this file.)
  5. Create spark-env.cmd in {SPARK_INSTALL_DIR}/conf
     #spark-env.cmd content
     SET HADOOP_HOME=C:\amit\hadoop\hadoop-2.6.0
     set SPARK_DIST_CLASSPATH=<Output of hadoop classpath>
     SET JAVA_HOME=C:\Progra~1\Java\jdk1.7.0_80
  6. Now run the examples or spark shell from {SPARK_INSTALL_DIR}/bin directory. Please note that you may have to run spark-env.cmd explicitly prior running the examples or spark-shell.

References :


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: