copy_if failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered #350
Comments
This is exactly the error you get with PyTorch 1.8 + CUDA 11. Could you please make sure you are using PyTorch 1.7 in a new conda environment, then compile and run the code again? I am fairly sure the environment was misconfigured to use 1.8 + CUDA 11.
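One way to confirm which combination is actually active is a small check like the sketch below. The helper names `parse_torch_version` and `is_reported_bad_combo` are hypothetical and only encode the PyTorch 1.8 + CUDA 11 combination named in the comment above; they are not part of PyTorch or MinkowskiEngine.

```python
import re


def parse_torch_version(version):
    """Split a version string such as '1.7.1+cu110' into ((1, 7, 1), 'cu110')."""
    base, _, local = version.partition("+")
    numbers = tuple(int(p) for p in re.findall(r"\d+", base)[:3])
    return numbers, local or None


def is_reported_bad_combo(torch_version, cuda_version):
    """Return True for PyTorch 1.8.x built against CUDA 11.x (the combination named above)."""
    numbers, _ = parse_torch_version(torch_version)
    cuda_major = int(str(cuda_version).split(".")[0]) if cuda_version else None
    return numbers[:2] == (1, 8) and cuda_major == 11


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        # torch.version.cuda is the CUDA version the installed wheel was built for.
        print("torch:", torch.__version__, "built for CUDA:", torch.version.cuda)
        print("matches reported combo:",
              is_reported_bad_combo(torch.__version__, torch.version.cuda))
```

Running this inside the freshly created environment shows whether the interpreter picks up the intended PyTorch build rather than one left over from another environment.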
Thanks for your quick reply. I just created a new conda environment and installed PyTorch 1.7.1 with CUDA 11.0. I also installed MinkowskiEngine from source with the following commands:
conda create -n py37 python=3.7
conda activate py37
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
pip install -U . -v --no-deps --install-option="--blas_include_dirs=${CONDA_PREFIX}/include" --install-option="--blas=openblas"
But I got the same error. Here is the environment configuration:
==========System==========
Linux-4.14.224-llgrid-10ms-x86_64-with-debian-buster-sid
DISTRIB_ID=GridOS
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="GridOS 18.04.5"
3.7.10 (default, Feb 26 2021, 18:47:35)
[GCC 7.3.0]
==========Pytorch==========
1.7.1
torch.cuda.is_available(): True
==========NVIDIA-SMI==========
/usr/bin/nvidia-smi
Driver Version 450.80.02
CUDA Version 11.0
VBIOS Version 88.00.7E.00.03
Image Version G500.0202.00.02
==========NVCC==========
/usr/local/pkg/cuda/cuda-11.0/bin/nvcc
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
==========CC==========
/usr/bin/c++
c++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
==========MinkowskiEngine==========
0.5.3
MinkowskiEngine compiled with CUDA Support: True
NVCC version MinkowskiEngine is compiled: 11000
CUDART version MinkowskiEngine is compiled: 11000
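The `11000` values in the report above can be decoded and compared with the toolkit version, as in the sketch below. It assumes they follow CUDA's usual integer encoding (major * 1000 + minor * 10); the function names are hypothetical.

```python
def decode_cuda_version(encoded):
    """Decode an integer such as 11000 into (major, minor), assuming major*1000 + minor*10."""
    major, remainder = divmod(int(encoded), 1000)
    return major, remainder // 10


def versions_match(nvcc_encoded, cudart_encoded, toolkit_version):
    """Check that both encoded values agree with a 'major.minor' string such as '11.0'."""
    expected = tuple(int(p) for p in toolkit_version.split(".")[:2])
    return decode_cuda_version(nvcc_encoded) == expected == decode_cuda_version(cudart_encoded)


print(decode_cuda_version(11000))             # (11, 0)
print(versions_match(11000, 11000, "11.0"))   # True
```

For the configuration listed here, the compiled NVCC and CUDART versions both decode to 11.0, which is consistent with the `cudatoolkit=11.0` install.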
I also created a new conda environment and installed PyTorch via pip:
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
And I got the same error as well. Here is the configuration:
==========System==========
Linux-4.14.224-llgrid-10ms-x86_64-with-debian-buster-sid
DISTRIB_ID=GridOS
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="GridOS 18.04.5"
3.7.10 (default, Feb 26 2021, 18:47:35)
[GCC 7.3.0]
==========Pytorch==========
1.7.1+cu110
torch.cuda.is_available(): True
==========NVIDIA-SMI==========
/usr/bin/nvidia-smi
Driver Version 450.80.02
CUDA Version 11.0
VBIOS Version 88.00.7E.00.03
Image Version G500.0202.00.02
==========NVCC==========
/usr/local/pkg/cuda/cuda-11.0/bin/nvcc
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Wed_Jul_22_19:09:09_PDT_2020
Cuda compilation tools, release 11.0, V11.0.221
Build cuda_11.0_bu.TC445_37.28845127_0
==========CC==========
/usr/bin/c++
c++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
==========MinkowskiEngine==========
0.5.3
MinkowskiEngine compiled with CUDA Support: True
NVCC version MinkowskiEngine is compiled: 11000
CUDART version MinkowskiEngine is compiled: 11000
RuntimeError Traceback (most recent call last)
<ipython-input-1-27eb35e5986d> in <module>
5 bcoords, bfeats = coordinates.cuda(), coordinates.cuda()
6 print(bcoords, bfeats) # without print, it works fine... print seems to be triggering something
----> 7 ME.SparseTensor(bfeats, bcoords)
~/MinkowskiEngine/MinkowskiEngine/MinkowskiSparseTensor.py in __init__(self, features, coordinates, tensor_stride, coordinate_map_key, coordinate_manager, quantization_mode, allocator_type, minkowski_algorithm, requires_grad, device)
270 )
271 coordinates, features, coordinate_map_key = self.initialize_coordinates(
--> 272 coordinates, features, coordinate_map_key
273 )
274 else: # coordinate_map_key is not None:
~/MinkowskiEngine/MinkowskiEngine/MinkowskiSparseTensor.py in initialize_coordinates(self, coordinates, features, coordinate_map_key)
298 coordinate_map_key,
299 (unique_index, inverse_mapping),
--> 300 ) = self._manager.insert_and_map(coordinates, *coordinate_map_key.get_key())
301 self.unique_index = unique_index.long()
302 coordinates = coordinates[self.unique_index]
~/MinkowskiEngine/MinkowskiEngine/MinkowskiCoordinateManager.py in insert_and_map(self, coordinates, tensor_stride, string_id)
177 """
178 tensor_stride = convert_to_int_list(tensor_stride, self.D)
--> 179 return self._manager.insert_and_map(coordinates, tensor_stride, string_id)
180
181 def insert_field(
RuntimeError: copy_if failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
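Because CUDA kernels run asynchronously, an illegal memory access is often reported at a later synchronization point than the call that caused it, which may be why adding a `print` changes the behaviour here. A minimal sketch for rerunning a reproduction script with synchronous kernel launches follows. `CUDA_LAUNCH_BLOCKING` is a standard CUDA environment variable; the helper names and the script path are hypothetical.

```python
import os
import subprocess
import sys


def blocking_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA kernel launches enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_with_blocking_launches(script_path):
    """Run script_path in a fresh interpreter so the variable is set before CUDA initializes."""
    result = subprocess.run([sys.executable, script_path], env=blocking_env(), check=False)
    return result.returncode
```

Calling `rerun_with_blocking_launches("repro.py")` on a script containing the failing snippet should make the traceback point closer to the kernel that actually faulted, at the cost of slower execution.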
After installing and reinstalling the environment and CUDA several times, it seems that
Hi,
I am installing MinkowskiEngine from source (commit ). When I ran the following code, I got a CUDA error:
RuntimeError: copy_if failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
I noticed this post, but I have tested with both pytorch 1.8.1+cu111 and pytorch 1.7.1+cu110, and both give the same error. Here are the two configurations I tried: