Commit ec997c4e authored by hoffmaps

Removed extra files not needed for testing

parent 1bee8d8f
library(ggplot2)
library(scales)
library(lubridate)
library(stringr)
library(rPython)
library(Rcpp)
top_dir <- "/home/mikola/slurm_simulator3/slurm_sim_tools/validation"
setwd(top_dir)
source("../Rutil/trace_job_util.R")
sdiag<-read.csv("sdiag.csv")
sdiag$sdiag_output_time <- as.POSIXct(sdiag$sdiag_output_time,format = "%Y-%m-%d %H:%M:%S")
sdiag$jobs_pending <- sdiag$jobs_submitted - sdiag$jobs_started - sdiag$jobs_completed -sdiag$jobs_canceled-sdiag$jobs_failed
plot(sdiag$sdiag_output_time, sdiag$jobs_pending)
---
title: "Slurm Simulator: Micro Cluster Tutorial"
output:
  pdf_document:
    latex_engine: xelatex
monofont: "DejaVu Sans Mono"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Preparing Trace Jobs from Micro Cluster
This installation guide was tested on a fresh installation of CentOS 7 (CentOS-7-x86_64-DVD-1611.iso, KDE Plasma Workspaces with Development Tools).
Since there are many absolute paths in slurm.conf, it can be helpful to create a separate user named *slurm* and use it for Slurm Simulator.
The following directory structure is used here (the respective directories will be created at the appropriate steps of the tutorial):
```{bash, eval=FALSE}
/home/slurm                  - Slurm user home directory
└── slurm_sim_ws             - Slurm simulator work space
    ├── bld_opt              - Slurm simulator build directory
    ├── sim                  - Directory where simulation will be performed
    ├── slurm_opt            - Slurm simulator binary installation directory
    ├── slurm_sim_tools      - Slurm simulator toolkit
    └── slurm_simulator      - Slurm simulator source code
```
# Installing Dependencies
## Slurm Simulator Dependencies
### Install MySQL (MariaDB in this case)
Install mariadb server and devel packages:
```{bash, eval=FALSE}
sudo yum install mariadb-server
sudo yum install mariadb-devel
```
Enable and start mariadb server:
```{bash, eval=FALSE}
sudo systemctl enable mariadb
sudo systemctl start mariadb
```
Run mysql_secure_installation for a more secure installation if needed.
If the SQL server is not accessible from outside, it is fine to skip this step:
```{bash, eval=FALSE}
sudo mysql_secure_installation
```
## Slurm Simulator Toolkit Dependencies
### Python
Install python3 with pymysql and pandas packages:
```{bash, eval=FALSE}
sudo yum -y install epel-release
sudo yum -y install python34 python34-libs python34-devel python34-numpy python34-scipy python34-pip
sudo pip3 install pymysql
sudo pip3 install pandas
```
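As an optional sanity check, the sketch below tries to import each named module with `python3` and reports any that fail. The function name `check_py_modules` is hypothetical and not part of the toolkit; the only assumption is that `python3` is on the `PATH`.

```bash
# Hypothetical helper (not part of slurm_sim_tools): try to import each
# named module with python3 and report the result per module.
# Returns non-zero if at least one import fails.
check_py_modules() {
    missing=0
    for mod in "$@"; do
        if python3 -c "import $mod" 2>/dev/null; then
            echo "ok: $mod"
        else
            echo "MISSING: $mod"
            missing=1
        fi
    done
    return "$missing"
}
```

For this tutorial it would be called as `check_py_modules pymysql pandas`.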
### R
Install R:
```{bash, eval=FALSE}
sudo yum -y install R R-Rcpp R-Rcpp-devel
sudo yum -y install python-devel
sudo yum install texlive-*
```
Install R-Studio:
```{bash, eval=FALSE}
wget https://download1.rstudio.org/rstudio-1.0.136-x86_64.rpm
sudo yum -y install rstudio-1.0.136-x86_64.rpm
```
In R-Studio or plain R, install the required packages:
```{r, eval=FALSE}
install.packages("ggplot2")
install.packages("gridExtra")
install.packages("cowplot")
install.packages("lubridate")
install.packages("rPython")
install.packages("rstudioapi")
```
# Preparing Slurm Simulator Workspace
Create a work space for Slurm simulation activities:
```{bash, eval=FALSE}
cd
mkdir slurm_sim_ws
cd slurm_sim_ws
```
# Installing Slurm Simulator
Obtain Slurm Simulator source code with git:
```{bash, eval=FALSE}
git clone https://github.com/nsimakov/slurm_simulator.git
cd slurm_simulator
```
Ensure that the slurm-17.02_Sim branch is used:
```{bash, eval=FALSE}
git branch
```
```
Output:
* slurm-17.02_Sim
```
If it is not, check out the proper branch:
```{bash, eval=FALSE}
git fetch
git checkout slurm-17.02_Sim
```
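The branch check can also be scripted, for example in a setup script. This is a minimal sketch; the function name `ensure_branch` is hypothetical. It uses `git symbolic-ref --short HEAD` and changes into the repository with `cd`, so that it also works with the older git shipped with CentOS 7.

```bash
# Hypothetical helper: compare the currently checked-out branch of a
# repository with the expected one. Returns non-zero on mismatch.
ensure_branch() {
    repo="$1"
    want="$2"
    # symbolic-ref prints the branch HEAD points to; it fails on a detached HEAD
    have=$(cd "$repo" && git symbolic-ref --short HEAD 2>/dev/null) || have=""
    if [ "$have" = "$want" ]; then
        echo "on $want"
    else
        echo "on '${have:-unknown}', expected '$want'"
        return 1
    fi
}
```

From the work space directory it would be called as `ensure_branch slurm_simulator slurm-17.02_Sim`.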
Prepare the build directory:
```{bash,eval=FALSE}
cd ..
mkdir bld_opt
cd bld_opt
```
Run configure:
```{bash,eval=FALSE}
../slurm_simulator/configure --prefix=/home/slurm/slurm_sim_ws/slurm_opt --enable-simulator \
--enable-pam --without-munge --enable-front-end --with-mysql-config=/usr/bin/ --disable-debug \
CFLAGS="-g -O3 -D NDEBUG=1"
```
Check config.log and ensure that mysql is found:
```{bash,eval=FALSE}
configure:4672: checking for mysql_config
configure:4690: found /usr/bin//mysql_config
```
Check that openssl is found:
```{bash,eval=FALSE}
configure:24145: checking for OpenSSL directory
configure:24213: gcc -o conftest -g -O3 -D NDEBUG=1 -pthread -I/usr/include -L/usr/lib \
conftest.c -lcrypto >&5
configure:24213: $? = 0
configure:24213: ./conftest
configure:24213: $? = 0
configure:24234: result: /usr
```
Slurm can work without MySQL or OpenSSL, so if they are not found Slurm can still be configured and built.
However, in most cases these libraries are needed for simulation.
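Both config.log checks can be combined into one small helper. This is only a sketch: the function name `check_config_log` is hypothetical, and the patterns assume the log line format shown in the excerpts above, which may differ between autoconf versions.

```bash
# Hypothetical helper (not part of the Slurm build system): summarize
# whether configure found mysql_config and which OpenSSL directory it
# reported, based on the contents of config.log.
check_config_log() {
    log="${1:-config.log}"
    status=0
    # configure logs "found <dir>/mysql_config" when the MySQL client is detected
    if grep -q 'found .*mysql_config' "$log"; then
        echo "mysql_config: found"
    else
        echo "mysql_config: NOT found"
        status=1
    fi
    # Print the first "result:" line that follows the OpenSSL directory check
    awk '/checking for OpenSSL directory/ { seen = 1; next }
         seen && /result:/ { sub(/^.*result: */, ""); print "OpenSSL result: " $0; exit }' "$log"
    return "$status"
}
```

Run it from `bld_opt` as `check_config_log config.log`; the return status reflects only the mysql_config check, and the OpenSSL line is printed for manual inspection.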
Compile and install binaries:
```{bash,eval=FALSE}
make -j install
```
# Installing Slurm Simulator Toolkit
Obtain the Slurm Simulator Toolkit with git:
```{bash,eval=FALSE}
cd ~/slurm_sim_ws
git clone https://github.com/nsimakov/slurm_sim_tools.git
```
---
title: "Analysing Slurm Real Output and Preparing for Simulation"
output:
  html_document: default
  html_notebook: default
---
```{r setup, echo=TRUE, results="hide",warning=TRUE,message=FALSE}
library(ggplot2)
library(gridExtra)
library(scales)
library(lubridate)
library(stringr)
library(rPython)
library(Rcpp)
#some global locations
top_dir <- "/home/mikola/slurm_simulator3/slurm_sim_tools/validation"
real_top_dir <- "/home/mikola/slurm_simulator3/slurm_real/5"
sim_top_dir <- "/home/mikola/slurm_simulator3/sim/micro3/results"
setwd(top_dir)
source(file.path(top_dir,"../Rutil/trace_job_util.R"))
```
# Read Data
```{r}
init_trace <- read.csv(file.path(top_dir,"test_trace.csv"))
init_trace$sim_submit <- as.POSIXct(init_trace$sim_submit,format = "%Y-%m-%d %H:%M:%S")
init_trace$sim_dependency <- ""
sacct_r <- read_sacct_out(file.path(real_top_dir,"slurm_acct.out"))
sacct_r$ref_job_id <- as.integer(sub("\\.sh","",sacct_r$JobName))
```
# Which Jobs Were Not Run
Sometimes jobs might not run, so let's check whether all jobs ran.
```{r}
jobs_not_done <- setdiff(init_trace$sim_job_id,sacct_r$ref_job_id)
print(paste("Number of jobs which were not run:",length(jobs_not_done)))
```
```{r}
print(init_trace[init_trace$sim_job_id %in% jobs_not_done,])
```
# Plotting Job Submission and Execution Times
```{r , fig.width=10, fig.height=6}
ggplot(sacct_r)+
geom_point(aes(x=local_job_id,y=Submit,colour="Submit Time"))+
geom_segment(aes(x=local_job_id,y=Start,xend=local_job_id,yend=End,colour="Run Time"))+
scale_colour_manual("",values = c("red","blue", "green"))
```
# Preparing for Simulator
Sometimes some jobs might not run, but we still want to run the simulator while waiting for new results
from the real Slurm run. Let's select only the jobs that were actually run in the real Slurm run; in this case that is all of them.
Let's also set the submit time and job_id in init_trace from the real Slurm run, so the trace can be fed to the simulator later.
```{r}
start_time <- min(sacct_r$Submit)
init_trace_start_time <- min(init_trace$sim_submit)
dt<-start_time-init_trace_start_time
init_trace$old_submit <- init_trace$sim_submit
init_trace$sim_submit <- init_trace$old_submit+dt
new_trace <- merge(init_trace,sacct_r,by.x = "sim_job_id", by.y = "ref_job_id",
suffixes = c("",".sacct_r"))
#let the job id be like in real slurm
new_trace$sim_job_id <- new_trace$JobID
submit_not_na <- !is.na(new_trace$Submit)
print(paste("NA submits:",sum(is.na(new_trace$Submit))))
new_trace$sim_submit[submit_not_na]<-new_trace$Submit[submit_not_na]
new_trace$sim_submit_ts <- as.integer(new_trace$sim_submit)
new_trace<-new_trace[order(new_trace$sim_submit_ts),]
new_trace$sim_duration_old <- new_trace$sim_duration
new_trace$sim_duration <- as.integer(unclass(new_trace$End)-unclass(new_trace$Start))
sum(abs(new_trace$sim_duration-new_trace$sim_duration_old))
#lets write traces for simulator
write_trace2(file.path(top_dir,"test_after_slurm_real.trace"),new_trace)
```
---
title: "Analysing Slurm Real and Simulated Backfill"
output:
  html_document: default
  html_notebook: default
---
```{r setup, echo=TRUE, results="hide",warning=TRUE,message=FALSE}
library(ggplot2)
library(gridExtra)
library(scales)
library(lubridate)
library(stringr)
library(rPython)
library(Rcpp)
library(plyr)
#some global locations
top_dir <- "/home/mikola/slurm_simulator3/slurm_sim_tools/validation"
real_top_dir <- "/home/mikola/slurm_simulator3/slurm_real/5"
sim_top_dir <- "/home/mikola/slurm_simulator3/sim/micro3/results"
setwd(top_dir)
source("../Rutil/trace_job_util.R")
```
# Read Data
```{r}
sacct_r <- read_sacct_out(file.path(real_top_dir,"slurm_acct.out"))
sacct_r$ref_job_id <- as.integer(sub("\\.sh","",sacct_r$JobName))
sacct_r$Slurm <- "Real"
sacct_r$NTasks <- NULL
sacct_r$ReqGRES <- NULL
sacct_s <- read_sacct_out(file.path(sim_top_dir,"jobcomp.log"))
sacct_s$ref_job_id <- as.integer(sacct_s$JobName)
sacct_s$Slurm <- "Simulated"
sacctM <- merge(sacct_r,sacct_s,by="local_job_id",all=TRUE,suffixes = c("_r","_s"))
sacctRB <- rbind(sacct_r,sacct_s)
```
```{r}
bf_s <- read.csv(file.path(sim_top_dir,"simstat_backfill.csv"))
colnames(bf_s)[colnames(bf_s) == 'output_time'] <- 't'
for(col in c("t","last_cycle_when"))bf_s[,col] <- as.POSIXct(bf_s[,col],format = "%Y-%m-%d %H:%M:%S")
#drop duplicates
bf_s<-bf_s[bf_s$last_cycle_when>as.POSIXct("2001-01-01"),]
bf_s<-bf_s[!duplicated(bf_s$last_cycle_when),]
bf_s$t <- bf_s$last_cycle_when
bf_s$run_sim_time <- bf_s$last_cycle/1000000.0
sdiag_r <- read.csv(file.path(real_top_dir,"sdiag.csv"))
for(col in c("sdiag_output_time","data_since","backfil_stats__last_cycle_when"))sdiag_r[,col] <- as.POSIXct(sdiag_r[,col],format = "%Y-%m-%d %H:%M:%S")
bf_r <- sdiag_r[sdiag_r$backfil_stats__last_cycle_when>as.POSIXct("2001-01-01"),c(
"backfil_stats__last_cycle_when",
"backfil_stats__last_cycle",
"backfil_stats__last_depth_cycle",
"backfil_stats__last_depth_cycle_try_sched",
"backfil_stats__last_queue_length"
)]
colnames(bf_r) <- sub("backfil_stats__","",colnames(bf_r))
#drop duplicates
bf_r<-bf_r[!duplicated(bf_r$last_cycle_when),]
bf_r$t <- bf_r$last_cycle_when
bf_r$run_real_time <- bf_r$last_cycle/1000000.0
```
# Plots
```{r, fig.width=20, fig.height=8}
grid.arrange(
ggplot(bf_r,aes(x=t,y=run_real_time))+ggtitle("Real")+
geom_point(size=1,colour="blue",alpha=0.25),
ggplot(bf_r,aes(x=last_depth_cycle_try_sched,y=run_real_time))+ggtitle("Real")+
geom_point(size=1,colour="blue",alpha=0.25)+stat_function(fun = function(x){0.004856*(x^0.575349)},n=100),
ggplot(bf_r,aes(x=log10(last_depth_cycle_try_sched),y=log10(run_real_time)))+ggtitle("Real")+
geom_point(size=1,colour="blue",alpha=0.25)+geom_smooth(method = "lm", colour = "black",formula = y~x),
ggplot(bf_s,aes(x=t,y=run_sim_time))+ggtitle("Simulated")+
geom_point(size=1,colour="blue",alpha=0.25),
ggplot(bf_s,aes(x=last_depth_cycle_try_sched,y=run_sim_time))+ggtitle("Simulated")+
geom_point(size=1,colour="blue",alpha=0.25),
ggplot(bf_s,aes(x=log10(last_depth_cycle_try_sched),y=log10(run_sim_time)))+ggtitle("Simulated")+
geom_point(size=1,colour="blue",alpha=0.25)+geom_smooth(method = "lm", colour = "black",formula = y~x),
ggplot(bf_s,aes(x=t,y=run_real_time))+ggtitle("Simulated Actual")+
geom_point(size=1,colour="blue",alpha=0.25),
ggplot(bf_s,aes(x=last_depth_cycle_try_sched,y=run_real_time))+ggtitle("Simulated Actual")+
geom_point(size=1,colour="blue",alpha=0.25),
ggplot(bf_s,aes(x=log10(last_depth_cycle_try_sched),y=log10(run_real_time)))+ggtitle("Simulated Actual")+
geom_point(size=1,colour="blue",alpha=0.25)+geom_smooth(method = "lm", colour = "black",formula = y~x),
ncol=3)
```
# Fit
```{r, fig.width=20, fig.height=8}
fit_r <- lm(log10(run_real_time)~log10(last_depth_cycle_try_sched),bf_r[bf_r$last_depth_cycle_try_sched!=0,])
summary(fit_r)
# log10(y) = Yr + Kr*log10(x)  =>  y = 10^Yr * x^Kr, i.e. a = 10^Yr and b = Kr
Yr<-coef(fit_r)[[1]]
Kr<-coef(fit_r)[[2]]
b_r <- Kr
a_r <- 10^Yr
nls(run_real_time~a*last_depth_cycle_try_sched^b,start = list(a = 1, b = 3),data=bf_r[bf_r$last_depth_cycle_try_sched!=0,])
nls(run_real_time~a*last_depth_cycle_try_sched^b,start = list(a = 1, b = 3),data=bf_s[bf_s$last_depth_cycle_try_sched!=0,],
control = nls.control(maxiter = 500))
nls(run_real_time~a*last_depth_cycle_try_sched,start = list(a = 1),data=bf_r[bf_r$last_depth_cycle_try_sched!=0,])
nls(run_real_time~a*last_depth_cycle_try_sched,start = list(a = 1),data=bf_s[bf_s$last_depth_cycle_try_sched!=0,],
control = nls.control(maxiter = 500))
fit_s <- lm(log10(run_real_time)~log10(last_depth_cycle_try_sched),bf_s[bf_s$last_depth_cycle_try_sched!=0,])
```
---
title: "Analysing Slurm Real and Simulated Output"
output:
  html_document: default
  html_notebook: default
---
```{r setup, echo=TRUE, results="hide",warning=TRUE,message=FALSE}
library(ggplot2)
library(gridExtra)
library(scales)
library(lubridate)
library(stringr)
library(rPython)
library(Rcpp)
library(plyr)
#some global locations
top_dir <- "/home/mikola/slurm_simulator3/slurm_sim_tools/validation"
real_top_dir <- "/home/mikola/slurm_simulator3/slurm_real/s3"
sim_top_dir <- "/home/mikola/slurm_simulator3/sim/micro/results/StartSecondsBeforeFirstJob_45"
setwd(top_dir)
source("../Rutil/trace_job_util.R")
source("micro_conf.R")
```
# Read Data
```{r}
init_start_time <- as.POSIXct("2017-03-01")
init_trace <- read.csv(file.path(top_dir,"test_trace.csv"))
init_trace$sim_submit <- as.POSIXct(init_trace$sim_submit,format = "%Y-%m-%d %H:%M:%S")
init_trace$sim_dependency <- ""
dt <- min(as.integer(init_trace$sim_submit))-as.integer(init_start_time)
print(paste("dt:",dt))
init_trace$sim_submit<-init_trace$sim_submit-dt
init_trace$sim_submit_ts <- as.integer(init_trace$sim_submit)
sacct_r <- read_sacct_out(file.path(real_top_dir,"slurm_acct.out"),micro_nodes)
sacct_r$ref_job_id <- as.integer(sub("\\.sh","",sacct_r$JobName))
sacct_r$Slurm <- "Real"
sacct_r$NTasks <- NULL
sacct_r$ReqGRES <- NULL
#shift time
dt <- min(as.integer(sacct_r$Submit))-as.integer(init_start_time)
print(paste("dt:",dt))
sacct_r[,c("Submit","Eligible","Start","End")]<-sacct_r[,c("Submit","Eligible","Start","End")]-dt
print(paste("Simulation time:",max(sacct_r$End)-min(sacct_r$Submit)))
sacct_s <- read_sacct_out(file.path(sim_top_dir,"jobcomp.log"),micro_nodes)
sacct_s$ref_job_id <- as.integer(sacct_s$JobName)
sacct_s$Slurm <- "Simulated"
#shift time
dt <- min(as.integer(sacct_s$Submit))-as.integer(init_start_time)
print(paste("dt:",dt))
sacct_s[,c("Submit","Eligible","Start","End")]<-sacct_s[,c("Submit","Eligible","Start","End")]-dt
print(paste("Simulation time:",max(sacct_s$End)-min(sacct_s$Submit)))
sacctM <- merge(sacct_r,sacct_s,by="local_job_id",all=TRUE,suffixes = c("_r","_s"))
sacctRB <- rbind(sacct_r,sacct_s)
```
## Checking that reference job_id matches
```{r}
print(paste("job id difference in real (which is ok):",sum(sacct_r$ref_job_id -sacct_r$local_job_id)))
print(paste("job id difference in simulated:",sum(sacct_s$ref_job_id -sacct_s$local_job_id)))
print(paste("users different between real and simulated:",sum(sacctM$User_r!=sacctM$User_s)))
print(paste("timelimit different between real and simulated:",sum(sacctM$Timelimit_r!=sacctM$Timelimit_s)))
print(paste("NCPUs different between real and simulated:",sum(sacctM$NCPUS_r!=sacctM$NCPUS_s)))
```
# Single simulation
```{r , fig.width=20, fig.height=6}
grid.arrange(
ggplot(data=sacctM)+
geom_point(aes(x=local_job_id,y=Submit_r,colour="Submit Time"))+
geom_segment(aes(x=local_job_id,y=Start_r,xend=local_job_id,yend=End_r,colour="Run Time"))+
geom_segment(aes(x=local_job_id,y=Start_s,xend=local_job_id,yend=End_s,colour="Run Time Sim"))+
scale_colour_manual("",values = c("red","blue", "green")),
ggplot(data=sacctM)+
geom_point(aes(x=local_job_id,y=(unclass(Start_s)-unclass(Start_r))/3600.0,colour="Submit Time")),
ncol=2
)
```
# Proper Node Assignment
```{r}
sacctRB_withReq <- merge(sacctRB,init_trace,by.x="local_job_id",by.y="sim_job_id")
#GPU Nodes
jobs_in_quiestion <- sum(sacctRB_withReq$sim_gres!="")
run_on_propernode <- sum(sacctRB_withReq$Nodes_G[sacctRB_withReq$sim_gres!=""]>0)
print(paste("Jobs asked for GPU but ended up on wrong nodes:",jobs_in_quiestion-run_on_propernode))
#Big Mem Nodes
jobs_in_quiestion <- sum(sacctRB_withReq$sim_req_mem>400000,na.rm = TRUE)
run_on_propernode <- sum(sacctRB_withReq$Nodes_B[(!is.na(sacctRB_withReq$sim_req_mem)) & sacctRB_withReq$sim_req_mem>400000]>0)
print(paste("Jobs asked for Big Mem but ended up on wrong nodes:",jobs_in_quiestion-run_on_propernode))
#CPU-N Nodes
jobs_in_quiestion <- sum(sacctRB_withReq$sim_features=="CPU-N",na.rm = TRUE)
run_on_propernode <- sum(sacctRB_withReq$Nodes_N[sacctRB_withReq$sim_features=="CPU-N"]>0)
print(paste("Jobs asked for CPU-N but ended up on wrong nodes:",jobs_in_quiestion-run_on_propernode))
#CPU-M Nodes
jobs_in_quiestion <- sum(sacctRB_withReq$sim_features=="CPU-M",na.rm = TRUE)
run_on_propernode <- sum(sacctRB_withReq$Nodes_M[sacctRB_withReq$sim_features=="CPU-M"]>0)
print(paste("Jobs asked for CPU-M but ended up on wrong nodes:",jobs_in_quiestion-run_on_propernode))
```
# Utilization
```{r}
dt <- 60L
util_s <- get_utilization(sacct_s,micro_nodes,dt)
util_r <- get_utilization(sacct_r,micro_nodes,dt)
util_s$Slurm <- "Simulated"
util_r$Slurm <- "Real"
util<-rbind(util_s,util_r)
```
```{r , fig.width=20, fig.height=6}
ggplot(data=util)+
geom_line(aes(x=t,y=total_norm,colour=Slurm))
```
---
title: "Analysing Slurm Real and Simulated Output"
output:
  html_document: default
  html_notebook: default
---
```{r setup, echo=TRUE, results="hide",warning=TRUE,message=FALSE}
library(ggplot2)
library(gridExtra)
library(scales)
library(lubridate)
library(stringr)
library(stringi)
library(rPython)
library(Rcpp)
library(plyr)
#some global locations
top_dir <- "/home/mikola/slurm_simulator3/slurm_sim_tools/validation"
real_top_dir <- "/home/mikola/slurm_simulator3/slurm_real/5"
sim_top_dir <- "/home/mikola/slurm_simulator3/sim/micro3/results"
setwd(top_dir)
source("../Rutil/trace_job_util.R")
```
# Read Data
```{r}
sacct_r <- read_sacct_out(file.path(real_top_dir,"slurm_acct.out"))
sacct_r$ref_job_id <- as.integer(sub("\\.sh","",sacct_r$JobName))
sacct_r$Slurm <- "Real"
sacct_r$NTasks <- NULL
sacct_r$ReqGRES <- NULL
sacct_r <- sacct_r[order(sacct_r$Start,sacct_r$local_job_id),]
sacct_r$istart <- seq_along(sacct_r$Start)
sacct_s <- read_sacct_out(file.path(sim_top_dir,"jobcomp.log"))
sacct_s$ref_job_id <- as.integer(sacct_s$JobName)
sacct_s$Slurm <- "Simulated"
sacct_s <- sacct_s[order(sacct_s$Start,sacct_s$local_job_id),]
sacct_s$istart <- seq_along(sacct_s$Start)
#shift time
dt <- min(as.integer(sacct_s$Submit)-as.integer(sacct_r$Submit))
print(paste("dt:",dt))
sacct_s[,c("Submit","Eligible","Start","End")]<-sacct_s[,c("Submit","Eligible","Start","End")]-dt
sacctM <- merge(sacct_r,sacct_s,by="local_job_id",all=TRUE,suffixes = c("_r","_s"))