Last week, I ran the DBpedia code using the existing hyperparameters, which had been tuned for the Freebase dataset. So the hyperparameters had to be properly tuned for the DBpedia subsets, and the models then compared based on the evaluations.
After discussing with the mentors, we decided to take three sizes for each of the DBpedia subsets. The train sizes would be 10^4, 10^5, and 10^6 triples (with the valid/test sets each being 10% of the train size).
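To make the split sizes concrete, here is a minimal sketch of that plan; the function name and the dictionary layout are illustrative, not taken from the actual extraction code.

```python
# Illustrative sketch of the planned subset sizes (names are hypothetical).
TRAIN_SIZES = [10**4, 10**5, 10**6]

def split_sizes(train_size):
    # valid and test sets are each 10% of the train size
    valid = test = train_size // 10
    return {"train": train_size, "valid": valid, "test": test}

for n in TRAIN_SIZES:
    print(split_sizes(n))
```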
In these two weeks, I could complete the 10^4 and 10^5 train sizes for SET1 and SET3 each, a total of 4 sets.
For TransE, ComplEx, and DistMult, the hyperparameters were:
learning rate, batch size, maximum iteration, embedding size, lambda, neg ratio, and contiguous sampling.
Tuning all the hyperparameters would be a very time-consuming process. Even if we limited the search space to just 3 different values for each hyperparameter, that would mean 3^7 different permutations for each approach. So I decided to tune just two important hyperparameters, learning rate and lambda, which means a 3^2 = 9 point grid, and to hand-tune some of the other hyperparameters later when required. I fixed the embedding size to 200. So it was 3 * 3^2 runs for the three approaches on one subset. In total there were 4 subsets, which made 4 * 3 * 3^2 = 108 permutations, phewww!!
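The counting above can be sketched as a plain grid enumeration. The candidate learning rates and lambdas below are placeholders, not the values I actually searched over; only the shape of the grid (4 subsets, 3 approaches, 3 x 3 values) matches what I described.

```python
# Sketch of the tuning grid; candidate values are assumed for illustration.
from itertools import product

subsets = ["SET1-1e4", "SET1-1e5", "SET3-1e4", "SET3-1e5"]
models = ["TransE", "ComplEx", "DistMult"]
learning_rates = [0.001, 0.01, 0.1]   # hypothetical candidates
lambdas = [0.0001, 0.001, 0.01]       # hypothetical candidates

grid = list(product(subsets, models, learning_rates, lambdas))
print(len(grid))  # 4 * 3 * 3 * 3 = 108 runs
```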