Even when there are multiple APIs, the mathematical part should be the same across all implementations.
As shown in the Wikipedia article on cross-entropy, the loss takes the log-probability as its input, so PyTorch's implementation should be correct.
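
To make that concrete, here is a small pure-Python sketch (no PyTorch needed) that checks the definition H(p, q) = -sum_x p(x) log q(x) against a loss computed directly from log-probabilities. The logits and target are made-up values and the helper names are my own, not PyTorch's; as far as I know the corresponding PyTorch pieces are `log_softmax` followed by `nll_loss`, which `cross_entropy` combines into one call on raw logits.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def cross_entropy_from_probs(p, q):
    # Textbook definition: H(p, q) = -sum_x p(x) * log q(x)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def nll_from_log_probs(log_q, target):
    # With a one-hot target, the sum collapses to -log q(target).
    return -log_q[target]

logits = [2.0, 0.5, -1.0]  # made-up scores for three classes
target = 0                 # index of the true class (assumed)

log_q = log_softmax(logits)
q = [math.exp(v) for v in log_q]
one_hot = [1.0 if i == target else 0.0 for i in range(len(logits))]

loss_definition = cross_entropy_from_probs(one_hot, q)
loss_log_input = nll_from_log_probs(log_q, target)
print(loss_definition, loss_log_input)
```

Both numbers agree up to floating-point error, which is why feeding log-probabilities into the loss matches the mathematical definition.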