How to compute cosine similarity between two words in a Word2Vec model in PySpark

Question:
When I train a Word2Vec model with the Python library gensim, I can call word2vec_result.similarity('apple', 'banana') to get the cosine similarity between apple and banana on my local machine.
But in pyspark (version 2.2), I can't find an equivalent function in the documentation after the model is built.

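Code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from pyspark import SparkConf, SparkContext
from pyspark.mllib.feature import Word2Vec

# The pyspark shell predefines sc; a standalone script has to create it.
sc = SparkContext.getOrCreate(SparkConf().setAppName("word2vec"))

def word2vec_run(rdd):
    # Not shown in the original question; assumed to fit a default Word2Vec model.
    return Word2Vec().fit(rdd)

directory = "data_path"
# One list of whitespace-separated tokens per input line.
inp = sc.textFile(directory).map(lambda row: row.split(" "))
model = word2vec_run(inp)
model.save(sc, "/data/word2vec_model")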

Is there a simple way to achieve this?


Answer:
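
One possible approach, sketched under assumptions rather than taken from the pyspark documentation as a ready-made feature: the mllib Word2VecModel exposes transform(word), which returns the vector for a single word, so the cosine similarity can be computed by hand from two such vectors. The helper names cosine_similarity and word_similarity below are hypothetical, and the zero-vector convention is a choice made for this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric sequences."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def word_similarity(model, word_a, word_b):
    """Compare two words using a fitted pyspark.mllib Word2VecModel.

    Assumes model.transform(word) returns a pyspark Vector, whose
    toArray() method yields the vector's values.
    """
    vec_a = model.transform(word_a).toArray()
    vec_b = model.transform(word_b).toArray()
    return cosine_similarity(vec_a, vec_b)
```

With the model from the question, the call would be word_similarity(model, 'apple', 'banana'). Both words need to be in the model's vocabulary; otherwise transform is expected to raise an error. If only the nearest neighbours of a word are needed, model.findSynonyms(word, num) returns words together with a similarity score.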
