Installing pyspark and spark 3.2 on Windows 10 or 11

HADOOP_HOME=c:\winutils
JAVA_HOME=C:\java11\jdk-11
PYSPARK_PYTHON=C:\python3\python.exe
SPARK_HOME=C:\Users\mike\spark\spark3-hadoop3
Path=C:\Python3\Scripts;C:\Python3;c:\java11\jdk-11\bin;c:\winutils\bin;C:\Users\mike\spark\spark3-hadoop3\bin;

You can add all but the “Path” variable under Settings -> System -> About -> Advanced system settings -> Environment Variables -> User Variables.

The “Path” variable is a system variable, so edit it under System Variables instead.

One thing to notice about all of these PATHS… there are NO spaces.
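If you want a quick way to double-check this, here is a small Python sketch (my own addition, and the file name check_env.py is just a placeholder) that prints each of the variables above and flags anything missing or containing a space:

check_env.py:
import os

# the variables set above - flag anything missing or containing a space
for var in ("HADOOP_HOME", "JAVA_HOME", "PYSPARK_PYTHON", "SPARK_HOME"):
    value = os.environ.get(var)
    if value is None:
        print(f"{var} is NOT set")
    elif " " in value:
        print(f"{var} contains a space: {value}")
    else:
        print(f"{var} = {value}")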

When I started out installing pyspark and spark (which require both java and python) on my Windows machine, python and java were installed in paths that contained spaces, like C:\Program Files\Git\cmd.

The problem is that programs like spark and pyspark don’t work well when their prerequisites (python and java) live in folders with spaces in the names.

So my advice, even though you may be able to get things working with spaces, is: DO NOT USE SPACES IN FOLDER NAMES.

I had errors because of the spaces (I’ve forgotten exactly what the error messages were), but finally I just made the “executive” decision to:

Uninstall and reinstall java and python in shorter paths, and more importantly hadoop, spark, and pyspark, with NO spaces in the folder names, as shown above. And voilà! Fewer headaches!
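Once everything is reinstalled, a quick sanity check I find handy is to start a local SparkSession and print the version. This is just a minimal sketch, assuming pyspark is installed and the variables above are set; if JAVA_HOME, SPARK_HOME, or PYSPARK_PYTHON are wrong, this is usually where it blows up:

smoke_test.py:
from pyspark.sql import SparkSession

# start a local Spark session and print the Spark version
spark = SparkSession.builder.master("local[*]").appName("smoke-test").getOrCreate()
print(spark.version)
spark.stop()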

By the way, here’s how I set the same thing up on a MacBook Pro M1 in my ~/.zprofile. Notice: NO spaces here either – who would put spaces in an install folder, or any folder for that matter, on a Mac? The only difference is Apple had a clue and stayed away from the obvious problem by default. Not that you cannot put a space in a macOS folder name – you can, and you would most likely shoot yourself in the foot right before starting the marathon…

# note: JAVA_HOME is set using backticks to run a program called java_home to get the actual
# JAVA_HOME path, which in this case is shown by the echo below
# backtick "`" mechanics - substitute the output of the command wrapped in them
# modern equivalent of backticks: $(command)

# echo $JAVA_HOME
/Library/Java/JavaVirtualMachines/temurin-19.jdk/Contents/Home

.zprofile:
export JAVA_HOME=`/usr/libexec/java_home`
export PYSPARK_PYTHON=/usr/bin/python3
export SPARK_HOME=/Users/mike/spark3/spark-3.3.2-bin-hadoop3
export HADOOP_HOME=/Users/mike/spark3/spark-3.3.2-bin-hadoop3
export PATH=$JAVA_HOME/bin:$SPARK_HOME/bin:$PATH
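To confirm PYSPARK_PYTHON actually took effect, here is another small sketch (my own addition, file name is just a placeholder) that runs a tiny job and asks a worker which interpreter it is running – it should print the Python you pointed PYSPARK_PYTHON at:

which_python.py:
import sys
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("local[*]").appName("which-python").getOrCreate()

# compare the driver's interpreter with the one the worker process used
print("driver:", sys.executable)
print("worker:", spark.sparkContext.parallelize([0]).map(lambda _: sys.executable).collect()[0])
spark.stop()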
