Installing a PySpark kernel
mkdir ~/.ipython/kernels/pyspark
vim ~/.ipython/kernels/pyspark/kernel.json

Contents of kernel.json:

{
  "display_name": "pySpark",
  "language": "python",
  "argv": [
    "/var/local/anaconda2/bin/python",
    "-m",
    "IPython.kernel",
    "-f",
    "{connection_file}"
  ],
  "env": {
    "JAVA_HOME": "/opt/jdk8",
    "SPARK_HOME": "/usr/hdp/3.0.1.0-187/spark2",
    "PYTHONPATH": "/usr/hdp/3.0.1.0-187/spark2/python:/usr/hdp/3.0.1.0-187/spark2/python/lib/py4j-0.10.7-src.zip",
    "PYTHONSTARTUP": "/usr/hdp/3.0.1.0-187/spark2/python/pyspark/shell.py",
    "PYSPARK_SUBMIT_ARGS": "pyspark-shell"
  }
}

Note that the PYTHONSTARTUP path must not contain a trailing space, and the paths above (Anaconda, JDK, HDP Spark) should be adjusted to match your own installation.
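Before restarting the notebook server, it can help to confirm that the spec file parses as valid JSON and contains the keys a kernel spec needs (a stray comma or trailing space can silently break the kernel). A minimal sketch, with the spec inlined for illustration — the `check_kernel_spec` helper is hypothetical, not part of any library:

```python
import json

def check_kernel_spec(text):
    """Parse a kernel.json and verify the keys a kernel spec needs."""
    spec = json.loads(text)  # raises ValueError on malformed JSON
    for key in ("display_name", "language", "argv"):
        assert key in spec, "missing required key: %s" % key
    # {connection_file} must appear in argv so the notebook server can
    # hand the connection info to the kernel process.
    assert "{connection_file}" in spec["argv"], "argv lacks {connection_file}"
    return spec

# Example: validate a trimmed-down version of the spec from this article.
example = """{
  "display_name": "pySpark",
  "language": "python",
  "argv": ["/var/local/anaconda2/bin/python", "-m", "IPython.kernel",
           "-f", "{connection_file}"]
}"""
print(check_kernel_spec(example)["display_name"])
```

To check the real file, read `~/.ipython/kernels/pyspark/kernel.json` and pass its contents to the same helper.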

Testing the kernel

import os
# Optional: set these here if they are not already in the kernel's env
#os.environ['SPARK_HOME'] = '/usr/hdp/3.0.1.0-187/spark2/'
#os.environ['JAVA_HOME'] = '/opt/jdk8'

from pyspark import SparkContext, SparkConf

# Spark config: point the app at the standalone master
conf = SparkConf().setAppName("testspark").setMaster("spark://10.244.0.29:7077")
sc = SparkContext(conf=conf)

text_file = sc.textFile("hdfs:///root/test/spark/test.txt")

# Classic word count: split lines into words, pair each word with 1,
# then sum the counts per word
counts = text_file.flatMap(lambda line: line.split(" ")) \
                  .map(lambda word: (word, 1)) \
                  .reduceByKey(lambda a, b: a + b)

# counts is a lazy RDD; collect() triggers the job and returns the results
print counts.collect()
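The flatMap / map / reduceByKey pipeline above is the classic word count. As a quick way to see what output to expect without a cluster, the same logic can be sketched in plain Python (the sample text below is made up for illustration; no Spark involved):

```python
from collections import Counter

# Same logic as the Spark pipeline, executed locally:
#   flatMap(split)                        -> one flat stream of words
#   map(word -> (word, 1)) + reduceByKey  -> per-word totals
def word_count(lines):
    words = [w for line in lines for w in line.split(" ")]
    return Counter(words)

sample = ["spark makes word count easy", "word count with spark"]
print(sorted(word_count(sample).items()))
```

On a real cluster, `counts.collect()` returns the same kind of (word, count) pairs, just computed in parallel across partitions.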