static_assert(cpus < 2 && gpus < 2, "The current code shares a single bitmap/nodeQ/edgeQ among all CPUs, and likewise among all GPUs, so at most 1 CPU and 1 GPU are allowed.");
static_assert(/*cpus < 2 && */ gpus < 2, "The current code shares a single bitmap/nodeQ/edgeQ among all CPUs, and likewise among all GPUs, so at most 1 GPU is allowed (the CPU check is disabled). cba97278");
// If count is larger than 1, this function processes devices [dev_id, dev_id+count) SERIALLY, not in parallel. Search for d5fb72b0 and cba97278 if you want to parallelize it.
// To parallelize it, you only need to lock edge_queue; there is no need to lock bitmap, node_queue, etc.