# 28 February 2018

In the Node2Vec paper, the authors propose that an embedding for every node in a graph can be learned by trying to maximize the dot product between the feature representation of a node and its neighbors, as determined by BFS (useful for structural equivalence) or DFS (useful for learning homophily, or finding nodes that are connected together).

I think that the strategy for homophily makes sense, but the BFS strategy doesn’t make sense. Imagine that you have have a simple graph that has two two nodes $$A$$ and $$B$$, and each are surrounded by 5 neighbors, and let’s assume that the neighbors are only connected to $$A$$ and $$B$$ respectively. In that case, if you wanted to learn structural equivalence, then $$A$$ and $$B$$ should have a similar embedding, while the neighbors should all have similar embeddings.

However, the objective function that the authors propose won’t achieve this. Instead, it will propose that $$A$$ and its neighbors have similar embeddings, and that $$B$$ and its neighbors have similar embeddings. Could this be ameliorated by having a “left” and “right” embedding vector for each node? Then, the objective function would be that you maximize the dot product between the left vector of a node, and the right vector of each of its neighbors?

Written on February 28, 2018