
Summarize the bug "Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!" #510

Open
Nikolatesla-lj opened this issue Apr 7, 2022 · 4 comments
Labels
bug Something isn't working

Comments

@Nikolatesla-lj

Nikolatesla-lj commented Apr 7, 2022

Checklist

Describe the issue

Steps to reproduce the bug

1. Run `vis_pred.py`.
2. The terminal prints `RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!`

Error message

/home/ub/anaconda3/envs/Genv3D/bin/python home/ub/Downloads/Open3D-ML-master/examples/vis_pred.py
INFO - 2022-04-07 14:08:39,636 - semantic_segmentation - Loading checkpoint /home/ljian/Downloads/Open3D-ML-master/examples/vis_weights_RandLANet.pth
INFO - 2022-04-07 14:08:41,694 - semantic_segmentation - Loading checkpoint /home/ljian/Downloads/Open3D-ML-master/examples/vis_weights_KPFCNN.pth
test 0/1: 100%|████████▍| 78416/78726 [00:02<00:00, 33596.46it/s]
Traceback (most recent call last):
File "/home/ub/Downloads/Open3D-ML-master/examples/vis_pred.py", line 163, in
main()
File "/home/ub/Downloads/Open3D-ML-master/examples/vis_pred.py", line 151, in main
pcs_with_pred = pred_custom_data(pc_names, pcs, pipeline_r, pipeline_k)
File "/home/ub/Downloads/Open3D-ML-master/examples/vis_pred.py", line 40, in pred_custom_data
results_r = pipeline_r.run_inference(data)
File "/home/ub/anaconda3/envs/Genv3D/lib/python3.8/site-packages/open3d/_ml3d/torch/pipelines/semantic_segmentation.py", line 172, in run_inference
valid_scores, valid_labels = filter_valid_label(
File "/home/ub/anaconda3/envs/Genv3D/lib/python3.8/site-packages/open3d/_ml3d/torch/modules/losses/semseg_loss.py", line 19, in filter_valid_label
valid_scores = torch.gather(valid_scores, 0,
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Expected behavior

No response

Open3D, Python and System information

- Operating system: (Ubuntu 20.04)
- Python version: (e.g. Python 3.8)
- Open3D version: (open3d_ML 0.15.1)
- Is this remote workstation?: no
- How did you install Open3D?: (conda)

Additional information

No response

@Nikolatesla-lj Nikolatesla-lj added the bug Something isn't working label Apr 7, 2022
@conby

conby commented Apr 26, 2022

Same here,

Open3D, Python and System information

  • Operating system: (Ubuntu 18.04, Nvidia Jetson/aarch64)
  • Python version: (e.g. Python 3.6)
  • Open3D version: (open3d_ML 0.15.1)
  • Is this remote workstation?: no
  • How did you install Open3D?: (build from source)

$ python3 vis_pred.py
Open3D was not compiled with BUILD_GUI, but script is importing open3d.visualization.gui
Open3D was not compiled with BUILD_GUI, but script is importing open3d.visualization.rendering


Using the Open3D PyTorch ops with CUDA 11 may have stability issues!

We recommend to compile PyTorch from source with compile flags
'-Xcompiler -fno-gnu-unique'

or use the PyTorch wheels at
https://github.com/isl-org/open3d_downloads/releases/tag/torch1.8.2

Ignore this message if PyTorch has been compiled with the aforementioned
flags.

See isl-org/Open3D#3324 and
pytorch/pytorch#52663 for more information on this
problem.


INFO - 2022-04-26 23:32:03,045 - semantic_segmentation - Loading checkpoint /home/x/work/Open3D-ML/examples/vis_weights_RandLANet.pth
INFO - 2022-04-26 23:32:18,718 - semantic_segmentation - Loading checkpoint /home/x/work/Open3D-ML/examples/vis_weights_KPFCNN.pth
test 0/1: 100%|████████▉| 78711/78726 [00:05<00:00, 15310.08it/s]
Traceback (most recent call last):
File "vis_pred.py", line 167, in
main()
File "vis_pred.py", line 155, in main
pcs_with_pred = pred_custom_data(pc_names, pcs, pipeline_r, pipeline_k)
File "vis_pred.py", line 40, in pred_custom_data
results_r = pipeline_r.run_inference(data)
File "/home/x/.local/lib/python3.6/site-packages/open3d/_ml3d/torch/pipelines/semantic_segmentation.py", line 175, in run_inference
model.cfg.ignored_label_inds, device)
File "/home/x/.local/lib/python3.6/site-packages/open3d/_ml3d/torch/modules/losses/semseg_loss.py", line 20, in filter_valid_label
valid_idx.unsqueeze(-1).expand(-1, num_classes))
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_gather)
test 0/1: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 78726/78726 [00:13<00:00, 5852.46it/s]

@lovelyyoshino

lovelyyoshino commented Apr 26, 2022

I can see the bug in this part of `run_inference` in `semantic_segmentation.py`:

        metric = SemSegMetric()
        valid_scores, valid_labels = filter_valid_label(
            torch.tensor(inference_result['predict_scores']).to(device),
            torch.tensor(data['label']), model.cfg.num_classes,
            model.cfg.ignored_label_inds, device)
        metric.update(valid_scores, valid_labels)
        log.info(f"Accuracy : {metric.acc()}")
        log.info(f"IoU : {metric.iou()}")

which can be seen in #435.
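To illustrate why that snippet fails: `torch.gather` requires its input and index tensors to live on the same device, but `torch.tensor(data['label'])` is created on CPU while the scores have been moved to CUDA. A minimal, self-contained sketch of the mismatch and the `.to(device)` fix (the tensor names here are hypothetical stand-ins, not Open3D-ML's actual data):

```python
import torch

# Pick CUDA when available, otherwise fall back to CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

scores = torch.rand(4, 3).to(device)   # like predict_scores, moved to the device
labels = torch.tensor([0, 2, 1, 0])    # like data['label'], created on CPU by default

# On a CUDA machine, gathering with the CPU index tensor raises the reported
# RuntimeError. Moving the index to the scores' device resolves it.
picked = torch.gather(scores, 1, labels.to(device).unsqueeze(-1))
assert picked.device == scores.device
print(picked.shape)  # torch.Size([4, 1])
```

The same pattern applied to the snippet above would be `torch.tensor(data['label']).to(device)` in place of `torch.tensor(data['label'])`.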

@RuiMargarido

We had this error while training PointTransformer on a CustomDataset. We managed to fix it by changing `run_inference` in `semantic_segmentation.py`.

We changed line 161 to replicate what is going on in `run_test()`, i.e.:
Add

if hasattr(inputs['data'], 'to'):
    inputs['data'].to(device)

before

results = model(inputs['data'])
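Put together, the guard moves the batch to the model's device before the forward pass, while skipping plain objects that have no `.to` method. A self-contained sketch of that pattern (`MockBatch` and `mock_model` are hypothetical stand-ins for Open3D-ML's batch object and model, used only to make the example runnable):

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

class MockBatch:
    """Stand-in for a pipeline batch that moves its tensors in place."""
    def __init__(self):
        self.point = torch.rand(8, 3)

    def to(self, device):
        self.point = self.point.to(device)

inputs = {"data": MockBatch()}

# The suggested guard: only call .to() when the batch supports it.
if hasattr(inputs["data"], "to"):
    inputs["data"].to(device)

mock_model = lambda batch: batch.point.sum()  # stand-in for model(inputs['data'])
results = mock_model(inputs["data"])
assert inputs["data"].point.device.type == device.type
```

Note the guard relies on the batch object's `.to` moving its tensors in place; if a batch type returned a new object instead, the result of `.to()` would need to be assigned back.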

@anuzk13

anuzk13 commented Mar 13, 2024

Thank you @RuiMargarido this worked for me!
