glm::isnan returns incorrect result in CUDA device code #727

@oliver-om

TL;DR

The definition of glm::isnan is infinitely recursive when compiled as CUDA code, and it returns incorrect results inside CUDA kernels.

Problem

When the scalar function glm::isnan is called inside a CUDA kernel, it returns false for values that are NaN. Here's a minimal example that demonstrates the problem:

#define GLM_FORCE_CUDA
#include <cuda.h>
#include <math_constants.h>
#include <cstdio>
#include <glm/glm.hpp>

__global__ void isnan_kernel() {
  const bool test_glm = glm::isnan(CUDART_NAN_F);
  const bool test_cuda = ::isnan(CUDART_NAN_F);
  printf("glm::isnan(CUDART_NAN_F) = %s\n", test_glm ? "true" : "false");
  printf("::isnan(CUDART_NAN_F) = %s\n", test_cuda ? "true" : "false");
}

int main(int argc, char* argv[]) {
  isnan_kernel<<<1, 1>>>();
  cudaDeviceSynchronize();
  return 0;
}

This code can be compiled with the command nvcc --gpu-architecture=compute_61 --gpu-code=sm_61 -g -O3 main.cu -o isnan_test. When I run it on my machine (workstation with GeForce GTX 1080, running Ubuntu 16.04 LTS, CUDA 8.0.61, NVIDIA driver version 384), I get the following output:

./isnan_test
glm::isnan(CUDART_NAN_F) = false
::isnan(CUDART_NAN_F) = true

If I modify the last line of the program above, and replace return 0; with return glm::isnan(0.0f);, the executable never terminates (due to the infinite recursion).

Diagnosis

Here's the implementation of glm::isnan from glm/detail/func_common.inl:

#	if GLM_HAS_CXX11_STL
		using std::isnan;
#	else
		template <typename genType> 
		GLM_FUNC_QUALIFIER bool isnan(genType x)
		{
			GLM_STATIC_ASSERT(std::numeric_limits<genType>::is_iec559, "'isnan' only accept floating-point inputs");

#			if GLM_HAS_CXX11_STL
				return std::isnan(x);
#			elif GLM_COMPILER & GLM_COMPILER_VC
				return _isnan(x) != 0;
#			elif GLM_COMPILER & GLM_COMPILER_INTEL
#				if GLM_PLATFORM & GLM_PLATFORM_WINDOWS
					return _isnan(x) != 0;
#				else
					return ::isnan(x) != 0;
#				endif
#			elif (GLM_COMPILER & (GLM_COMPILER_GCC | (GLM_COMPILER_APPLE_CLANG | GLM_COMPILER_LLVM))) && (GLM_PLATFORM & GLM_PLATFORM_ANDROID) && __cplusplus < 201103L
				return _isnan(x) != 0;
#			elif GLM_COMPILER & GLM_COMPILER_CUDA
				return isnan(x) != 0;
#			else
				return std::isnan(x);
#			endif
		}
#	endif

When GLM_COMPILER & GLM_COMPILER_CUDA is true, glm::isnan calls the unqualified isnan. Unqualified name lookup searches the enclosing namespace before the global scope and stops at the first scope in which it finds a declaration. Because the argument is a fundamental type, argument-dependent lookup adds no further candidates. Thus, isnan resolves to glm::isnan itself, and the function definition becomes infinitely recursive.
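The lookup behaviour described above can be reproduced in plain C++ without GLM or CUDA. The following is only a sketch: the namespace demo, the function name check_nan, and the call counter are hypothetical stand-ins for glm, isnan, and nothing in GLM respectively, and the counter exists solely so that the demonstration terminates.

```cpp
#include <limits>

// Global-scope check, standing in for CUDA's ::isnan (hypothetical name).
bool check_nan(float x) { return x != x; }

namespace demo {  // stands in for namespace glm

int calls = 0;  // counts entries; used only to stop the demonstration

template <typename T>
bool check_nan(T x) {
    // Guard so the demo terminates. The code quoted in the issue has no such exit.
    if (++calls >= 4) return false;
    // Unqualified call: lookup finds demo::check_nan in the enclosing
    // namespace and never considers the global ::check_nan.
    return check_nan(x);
}

}  // namespace demo
```

Calling demo::check_nan with a NaN argument re-enters demo::check_nan until the guard fires, while ::check_nan with the same argument returns true. The false that the guarded sketch returns is an artifact of the guard and is not offered as an explanation of the device-side result.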

Proposed solution

Instead of calling the unqualified isnan, the CUDA branch should probably call the global-scope ::isnan explicitly, as the non-Windows Intel branch already does.
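As a rough host-side illustration of the proposed change, the sketch below wraps a qualified call in a function of the same name. The namespace sketch is a hypothetical stand-in for glm, and the example assumes a C++11 toolchain whose <math.h> declares isnan at global scope; it is not the GLM source, and it has not been tested under NVCC.

```cpp
#include <math.h>   // assumed to declare isnan at global scope in C++11 mode
#include <limits>

namespace sketch {  // hypothetical stand-in for namespace glm

template <typename genType>
bool isnan(genType x) {
    static_assert(std::numeric_limits<genType>::is_iec559,
                  "'isnan' only accepts floating-point inputs");
    // The leading :: forces lookup to start at global scope, so this call
    // cannot resolve back to sketch::isnan.
    return ::isnan(x) != 0;
}

}  // namespace sketch
```

With the qualification in place, sketch::isnan(std::numeric_limits<float>::quiet_NaN()) forwards to the global function instead of calling itself.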

Other notes

  • I didn't investigate, but I imagine that glm::isfinite might suffer from the same problem.
  • It's unclear how, or why, the call to glm::isnan terminates inside the CUDA kernel. When you call glm::isnan in host-side code compiled by NVCC, it does not terminate.
