Wednesday, April 4, 2018

Building PyTorch v0.3.0 and creating a Docker image (Ubuntu 16.04, xenial, ppc64le, Minsky)

http://hwengineer.blogspot.kr/search?q=pytorch

The blog linked above already walks through the same process. The steps below differ in that they build a different version, PyTorch v0.3.0, and use Anaconda3.

1. Building PyTorch 0.3.0 on Minsky (Ubuntu 16.04, ppc64le)

Installing Anaconda 3
root@minsky:/home/minsky# wget https://repo.continuum.io/archive/Anaconda3-5.1.0-Linux-ppc64le.sh
root@minsky:/home/minsky# chmod u+x Anaconda3-5.1.0-Linux-ppc64le.sh
root@minsky:/home/minsky# ./Anaconda3-5.1.0-Linux-ppc64le.sh
The installation path was set to /opt/anaconda3.

root@minsky:/home/minsky# . /root/.bashrc
root@minsky:/home/minsky# env | grep PATH
LD_LIBRARY_PATH=/usr/lib/openmpi/lib:/usr/lib/mpich/lib:
PATH=/opt/anaconda3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
root@minsky:/home/minsky# which conda
/opt/anaconda3/bin/conda
root@ubuntu:/home/minsky# export CMAKE_PREFIX_PATH=/opt/anaconda3
root@minsky:/home/minsky# conda install numpy pyyaml setuptools cmake cffi openblas 
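
The installer above was run interactively. As a rough sketch, the same install could probably be scripted with the installer's batch-mode flags (-b to accept the license without prompting, -p to set the prefix). The run helper and the DRY_RUN variable are hypothetical names introduced here for illustration. By default the script only prints the commands. Batch mode does not edit .bashrc, so the PATH is exported by hand.

```shell
#!/usr/bin/env bash
# Sketch: unattended Anaconda3 install into /opt/anaconda3 (assumed equivalent
# to the interactive run above; not taken from the original post).
INSTALLER="Anaconda3-5.1.0-Linux-ppc64le.sh"
PREFIX="/opt/anaconda3"   # same install path as chosen above

# Hypothetical helper: prints each command by default; set DRY_RUN=0 to execute.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_anaconda() {
    run wget "https://repo.continuum.io/archive/${INSTALLER}" || return 1
    # -b: batch mode (no prompts), -p: installation prefix
    run bash "${INSTALLER}" -b -p "${PREFIX}"
}

install_anaconda

# Make conda visible to this shell and to the later cmake-based build.
export PATH="${PREFIX}/bin:${PATH}"
export CMAKE_PREFIX_PATH="${PREFIX}"
```

Running it with DRY_RUN=0 performs the download and install for real.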

Cloning the source from the PyTorch GitHub repository and building it
root@minsky:/home/minsky# mkdir data
root@minsky:/home/minsky# cd data
root@minsky:/home/minsky/data# git clone --recursive https://github.com/pytorch/pytorch/
root@minsky:/home/minsky/data# cd pytorch
root@ubuntu:/home/minsky/data/pytorch# git checkout tags/v0.3.0
root@ubuntu:/home/minsky/data/pytorch# python setup.py install
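
The checkout and build steps can be sketched as one small script. Two steps here are not in the original transcript and are assumptions: syncing the submodules after checking out the tag (the revisions recorded at v0.3.0 may differ from those cloned at master), and a final import check run from outside the source tree so that Python picks up the installed package rather than the local torch directory. SRC_DIR, run and DRY_RUN are hypothetical names, and by default the script only prints what it would do.

```shell
#!/usr/bin/env bash
# Sketch only: not the original author's exact procedure.
PYTORCH_TAG="v0.3.0"
SRC_DIR="${SRC_DIR:-/home/minsky/data/pytorch}"

# Hypothetical helper: prints each command by default; set DRY_RUN=0 to execute.
# Note: the dry-run echo does not preserve shell quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_pytorch() {
    run cd "${SRC_DIR}" || return 1
    run git checkout "tags/${PYTORCH_TAG}" || return 1
    # Assumption: bring submodules to the revisions recorded at the tag.
    run git submodule update --init --recursive || return 1
    run python setup.py install || return 1
    # Leave the source tree before importing the installed package.
    run cd / || return 1
    run python -c "import torch; print(torch.__version__)"
}

build_pytorch
```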

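2. Building PyTorch v0.3.0 as a Docker image
First, gather all the files to be installed inside the Docker image into the docker_img directory (/home/minsky/docker_img in this example). Because the base image is nvidia/cuda-ppc64le:8.0-cudnn6-devel-ubuntu16.04, provided by NVIDIA, CUDA and libcudnn do not need to be installed separately.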
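
Before starting the container, it may be worth checking that both files are actually staged. The following is a minimal sketch. check_staged_files and STAGE_DIR are hypothetical names, and the file list simply mirrors the two files used in this post.

```shell
#!/usr/bin/env bash
# Sketch: verify that the files to be installed in the container are staged.
STAGE_DIR="${STAGE_DIR:-/home/minsky/docker_img}"
REQUIRED_FILES="Anaconda3-5.1.0-Linux-ppc64le.sh mldl-repo-local_4.0.0_ppc64el.deb"

# Hypothetical helper: returns 0 only if every required file exists in the directory.
check_staged_files() {
    dir="$1"
    missing=0
    for f in ${REQUIRED_FILES}; do
        if [ ! -f "${dir}/${f}" ]; then
            echo "missing: ${dir}/${f}" >&2
            missing=1
        fi
    done
    return "${missing}"
}

if check_staged_files "${STAGE_DIR}"; then
    echo "staging directory is ready"
else
    echo "staging directory is incomplete"
fi
```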

root@minsky:/home/minsky/docker_img# ls -al
total 2370376
drwxr-xr-x 2 root root 4096 Apr 4 00:15 .
drwxr-xr-x 18 minsky minsky 4096 Apr 4 00:11 ..
-rwxr--r-- 1 root root 299557404 Apr 4 00:11 Anaconda3-5.1.0-Linux-ppc64le.sh
-rw-r--r-- 1 root root 386170568 Apr 4 00:12 mldl-repo-local_4.0.0_ppc64el.deb

root@minsky:/home/minsky# nvidia-docker run -ti -v /home/minsky/docker_img:/docker nvidia/cuda-ppc64le:8.0-cudnn6-devel-ubuntu16.04
Unable to find image 'nvidia/cuda-ppc64le:8.0-cudnn6-devel-ubuntu16.04' locally
8.0-cudnn6-devel-ubuntu16.04: Pulling from nvidia/cuda-ppc64le
Digest: sha256:1ca56d91ac9c3045383ec4ea2789499f9ac1a62f2daa9a447539c4e147ff0518
Status: Downloaded newer image for nvidia/cuda-ppc64le:8.0-cudnn6-devel-ubuntu16.04

root@0e1b46e8b699:/# cd docker
root@0e1b46e8b699:/docker# ls -al
total 2370364
drwxr-xr-x 2 root root 4096 Apr 4 04:49 .
drwxr-xr-x 1 root root 4096 Apr 4 04:53 ..
-rwxr--r-- 1 root root 299557404 Apr 4 04:11 Anaconda3-5.1.0-Linux-ppc64le.sh
-rw-r--r-- 1 root root 386170568 Apr 4 04:12 mldl-repo-local_4.0.0_ppc64el.deb
root@0e1b46e8b699:/docker#

root@0e1b46e8b699:/docker# dpkg -i mldl-repo-local_4.0.0_ppc64el.deb
Selecting previously unselected package mldl-repo-local.
Preparing to unpack mldl-repo-local_4.0.0_ppc64el.deb ...
Unpacking mldl-repo-local (4.0.0) ...
Setting up mldl-repo-local (4.0.0) ...
OK

root@0e1b46e8b699:/docker# apt-get update
root@0e1b46e8b699:/docker# apt-get install -y libnccl-dev libnccl1 python-ncclient bazel libopenblas-dev libopenblas libopenblas-base git vim

Inside the container, repeat the same steps as in section 1 (building PyTorch).
Then, from another terminal, look up the container's ID and commit it.

root@minsky:/home/minsky# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0e1b46e8b699 nvidia/cuda-ppc64le:8.0-cudnn6-devel-ubuntu16.04 "/bin/bash" 30 minutes ago Up 30 minutes festive_stallman

root@minsky:/home/minsky# docker commit 0e1b46e8b699 brlee/pytorch-ppc64le-xenial:v0.3.0
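
The commit step, plus a quick check of the resulting image, could be sketched as follows. The verification command is an assumption, not something the original post ran: it calls the Python interpreter by its full path because a non-interactive docker run does not necessarily pick up the PATH that the installer added to .bashrc. The run helper and DRY_RUN are hypothetical, and by default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch: commit the running container and check that torch imports in the new image.
BASE_IMAGE="nvidia/cuda-ppc64le:8.0-cudnn6-devel-ubuntu16.04"
TARGET_IMAGE="brlee/pytorch-ppc64le-xenial:v0.3.0"

# Hypothetical helper: prints each command by default; set DRY_RUN=0 to execute.
# Note: the dry-run echo does not preserve shell quoting.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

commit_and_verify() {
    container_id="$1"
    run docker commit "${container_id}" "${TARGET_IMAGE}" || return 1
    run nvidia-docker run --rm "${TARGET_IMAGE}" \
        /opt/anaconda3/bin/python -c "import torch; print(torch.__version__)"
}

# The ID can also be looked up with:
#   docker ps -q --filter "ancestor=${BASE_IMAGE}"
commit_and_verify 0e1b46e8b699
```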
