This is an archived project. Repository and other project resources are read-only.
Commit 978a252c in Recolic / gpma_bfs, authored 5 years ago by Recolic Keghart
add gpu_factor for load_balance, adjust main logic
parent 54047695
Showing 2 changed files with 6 additions and 6 deletions:
gpma_bfs_demo.cu: 3 additions, 5 deletions
multidev.cuh: 3 additions, 1 deletion
gpma_bfs_demo.cu (+3 −5)
@@ -38,11 +38,8 @@ int main(int argc, char **argv) {
     char *data_path = argv[1];
     int bfs_start_node = std::atoi(argv[2]);
-#if CUDA_SM >= 60
-    // heap size limit is KNOWN to be required at SM_75(Tesla T4),SM_61(Tesla P4), and KNOWN to be forbidden at SM_50(GEForce 750).
-    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 1024ll * 1024 * 1024);
+    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 1024ll * 1024 * 700);
     cudaDeviceSetLimit(cudaLimitDevRuntimeSyncDepth, 5);
-#endif
     thrust::host_vector<int> host_x;
     thrust::host_vector<int> host_y;
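The two heap-size expressions in this hunk are easier to compare as byte counts. The following is a minimal sketch in plain C++ (no CUDA headers needed); the names `heap_bytes_1024`, `heap_bytes_700`, and `to_mib` are hypothetical and exist only for illustration.

```cpp
// Byte counts of the two heap-limit expressions that appear in the hunk.
// The `ll` suffix makes each product a long long, so larger limits would
// not overflow a 32-bit int.
constexpr long long heap_bytes_1024 = 1024ll * 1024 * 1024; // 1 GiB
constexpr long long heap_bytes_700  = 1024ll * 1024 * 700;  // 700 MiB

// Hypothetical helper: convert a byte count to whole MiB.
constexpr long long to_mib(long long bytes) {
    return bytes / (1024ll * 1024);
}

// Compile-time checks of the arithmetic.
static_assert(heap_bytes_1024 == 1073741824ll, "1 GiB in bytes");
static_assert(heap_bytes_700 == 734003200ll, "700 MiB in bytes");
```

The 700 variant asks for roughly 68% of the 1 GiB figure. The commit does not state why that value was chosen.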
@@ -64,7 +61,8 @@ int main(int argc, char **argv) {
     int step = half / num_slide;
     LOG_TIME("before init_csr_gpma")
-    GPMA_multidev<1, 1> gpma(node_size);
+    constexpr size_t cpu_count = 4;
+    GPMA_multidev<cpu_count - 1, 1> gpma(node_size);
     cudaDeviceSynchronize();
     LOG_TIME("before update_gpma 1")
multidev.cuh (+3 −1)
@@ -33,6 +33,7 @@ namespace gpma_impl {
             : mapKeyToSlot(hashSize, (size_t)(-1)) {}
         // void init(const CpuArrT &ptrs_cpu, const GpuArrT &ptrs_gpu) {}
+        static constexpr size_t gpu_factor = 7; // 1 GPU is equals to 7 CPU.
         // Given KEY, returns the ID(offset from zero) of device, which is responsible to this KEY.
         [[gnu::always_inline]] size_t select_device(const KEY_TYPE &k) {
@@ -41,7 +42,8 @@ namespace gpma_impl {
             auto dev_id = mapKeyToSlot.get(hashKey);
             if(dev_id == (size_t)(-1)) {
                 // appoint a device for a new hash.
-                dev_id = hashKey % (cpu_instances + gpu_instances);
+                dev_id = hashKey % (cpu_instances + gpu_instances * gpu_factor);
+                dev_id = (dev_id > cpu_instances) ? (cpu_instances + (dev_id - cpu_instances) / gpu_factor) : dev_id;
                 // Add link: hashKey => dev_id
                 return mapKeyToSlot.set(hashKey, dev_id);
             }
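The effect of `gpu_factor` on load balance can be checked in isolation. The sketch below copies the two-step mapping from `select_device` into a free function and counts how sequential hash keys spread across devices. `pick_device` and `count_assignments` are hypothetical names, the slot cache (`mapKeyToSlot`) is left out, and treating the first template argument of `GPMA_multidev` as the CPU instance count is an assumption.

```cpp
#include <cstddef>
#include <vector>

// Value taken from the diff: one GPU is weighted like 7 CPU instances.
constexpr std::size_t gpu_factor = 7;

// Hypothetical stand-alone copy of the mapping in select_device.
// Device IDs 0..cpu_instances-1 are CPUs; GPUs follow.
std::size_t pick_device(std::size_t hash_key,
                        std::size_t cpu_instances,
                        std::size_t gpu_instances) {
    // Each GPU owns gpu_factor consecutive slots, each CPU owns one.
    std::size_t slot = hash_key % (cpu_instances + gpu_instances * gpu_factor);
    // Collapse a GPU's slots back onto its single device ID.
    return (slot > cpu_instances)
               ? cpu_instances + (slot - cpu_instances) / gpu_factor
               : slot;
}

// Count how many of the keys 0..n_keys-1 land on each device.
std::vector<std::size_t> count_assignments(std::size_t n_keys,
                                           std::size_t cpus,
                                           std::size_t gpus) {
    std::vector<std::size_t> counts(cpus + gpus, 0);
    for (std::size_t k = 0; k < n_keys; ++k) {
        ++counts[pick_device(k, cpus, gpus)];
    }
    return counts;
}
```

With 3 CPU instances and 1 GPU instance (the apparent configuration for `cpu_count = 4`), the modulus is 3 + 1 * 7 = 10, so each CPU receives one tenth of uniformly distributed keys and the GPU receives seven tenths. The comparison `dev_id > cpu_instances` leaves the slot equal to `cpu_instances` unchanged, which is already the first GPU's ID, so `>` and `>=` give the same result here.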