Differences
This shows you the differences between two versions of the page.
technical:recipes:tensorflow-in-virtualenv [2021-02-24 23:46] – frey | technical:recipes:tensorflow-in-virtualenv [2021-03-08 19:26] (current) – [VALET Package Definition] anita | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== TensorFlow Python Virtual Environment ====== | ||
+ | |||
+ | This page documents the creation of a Python virtual environment (virtualenv) containing the TensorFlow software for machine learning on the Caviness HPC system((The steps should also work on the DARWIN HPC system, though with different package versions.)). | ||
+ | |||
+ | ===== Prepare Workgroup Directory ===== | ||
+ | |||
+ | Prepare to add software in the standard sub-directories of the workgroup storage: | ||
+ | |||
+ | <code bash> | ||
+ | [user@login01 ~]$ workgroup -g my_workgroup | ||
+ | [(my_workgroup: | ||
+ | [(my_workgroup: | ||
+ | </ | ||
+ | |||
+ | These commands create any missing directories. | ||
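The "create any missing directories" behaviour can be sketched with `mkdir -p`, which succeeds whether or not the target already exists. This is a minimal illustration only: `SW_ROOT` is a temporary stand-in for the workgroup storage, and the sub-directory names `demo/one` and `demo/two` are hypothetical placeholders, not the directories used on Caviness.

```shell
#!/bin/bash
# Hypothetical stand-in for the workgroup storage root; on the cluster the
# real location comes from the workgroup environment, not from mktemp.
SW_ROOT="$(mktemp -d)"

# Placeholder sub-directory names, chosen only for illustration.
for subdir in demo/one demo/two; do
    # -p creates missing parent directories and succeeds if the
    # directory is already present.
    mkdir -p "${SW_ROOT}/${subdir}"
done

# Repeating the command is harmless: nothing is recreated or emptied.
mkdir -p "${SW_ROOT}/demo/one"
rerun_status=$?
echo "second mkdir -p exit status: ${rerun_status}"
```

Because the command is idempotent, it is safe to run it every time before installing a new version.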
+ | |||
+ | ===== Create TensorFlow Virtualenv ===== | ||
+ | |||
+ | The Intel Python distribution will form the basis for the TensorFlow virtualenv, so add it to the environment: | ||
+ | |||
+ | <code bash> | ||
+ | [(my_workgroup: | ||
+ | Adding package `intel-python/ | ||
+ | (base) [(my_workgroup: | ||
+ | </ | ||
+ | |||
+ | Notice the prompt changed: | ||
+ | |||
+ | The '' | ||
+ | |||
+ | <code bash> | ||
+ | (base) [frey@login00 ~]$ conda search ' | ||
+ | Loading channels: done | ||
+ | # Name | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | |||
+ | (base) [frey@login00 ~]$ conda search ' | ||
+ | Loading channels: done | ||
+ | # Name | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | tensorflow | ||
+ | </ | ||
+ | |||
+ | All versions of the TensorFlow virtualenv will be stored in the common base directory, '' | ||
+ | |||
+ | <code bash> | ||
+ | [(my_workgroup: | ||
+ | 2.2.0-gpu | ||
+ | [(my_workgroup: | ||
+ | |||
+ | [(my_workgroup: | ||
+ | 2.3.0-intel-python3.8 | ||
+ | [(my_workgroup: | ||
+ | </ | ||
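The two directory names shown above (`2.2.0-gpu` and `2.3.0-intel-python3.8`) appear to combine a TensorFlow release with a variant suffix. Assuming that convention holds, a script could split such a name with plain shell parameter expansion. The function name `split_version_id` is hypothetical.

```shell
#!/bin/bash
# Split a version directory name such as "2.3.0-intel-python3.8" into the
# release and the variant suffix. Assumes the first hyphen separates the two.
split_version_id() {
    local id="$1"
    local release="${id%%-*}"   # text before the first hyphen
    local variant="${id#*-}"    # text after the first hyphen

    # A name without any hyphen has no variant part.
    if [ "${variant}" = "${id}" ]; then
        variant=""
    fi
    printf '%s %s\n' "${release}" "${variant}"
}

split_version_id "2.2.0-gpu"              # prints "2.2.0 gpu"
split_version_id "2.3.0-intel-python3.8"  # prints "2.3.0 intel-python3.8"
```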
+ | |||
+ | The virtualenvs are created using the '' | ||
+ | |||
+ | <code bash> | ||
+ | (base) [(my_workgroup: | ||
+ | WARNING: A directory already exists at the target location '/ | ||
+ | but it is not a conda environment. | ||
+ | Continue creating environment (y/[n])? y | ||
+ | |||
+ | : | ||
+ | |||
+ | Preparing transaction: | ||
+ | Verifying transaction: | ||
+ | Executing transaction: | ||
+ | # | ||
+ | # To activate this environment, | ||
+ | # | ||
+ | # $ conda activate / | ||
+ | # | ||
+ | # To deactivate an active environment, | ||
+ | # | ||
+ | # $ conda deactivate | ||
+ | </ | ||
+ | |||
+ | We're **not** going to activate that virtualenv -- we will install the other one next: | ||
+ | |||
+ | <code bash> | ||
+ | (base) [(my_workgroup: |
+ | WARNING: A directory already exists at the target location '/ | ||
+ | but it is not a conda environment. | ||
+ | Continue creating environment (y/[n])? y | ||
+ | |||
+ | : | ||
+ | |||
+ | Preparing transaction: | ||
+ | Verifying transaction: | ||
+ | Executing transaction: | ||
+ | # | ||
+ | # To activate this environment, | ||
+ | # | ||
+ | # $ conda activate / | ||
+ | # | ||
+ | # To deactivate an active environment, | ||
+ | # | ||
+ | # $ conda deactivate | ||
+ | </ | ||
+ | |||
+ | Ignore that '' | ||
+ | |||
+ | <code bash> | ||
+ | (base) [(my_workgroup: | ||
+ | [(my_workgroup: | ||
+ | </ | ||
+ | |||
+ | Notice the '' | ||
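A prompt prefix is easy to overlook, so a script may prefer to test the shell state directly. The sketch below relies on the `CONDA_DEFAULT_ENV` variable that conda exports while an environment is active. The function name `conda_env_status` is hypothetical, and the check makes no assumption about how the environment was entered or left.

```shell
#!/bin/bash
# Report whether a conda environment is active in the current shell by
# inspecting CONDA_DEFAULT_ENV, which conda sets on activation.
conda_env_status() {
    if [ -n "${CONDA_DEFAULT_ENV:-}" ]; then
        echo "active: ${CONDA_DEFAULT_ENV}"
    else
        echo "no conda environment active"
    fi
}

conda_env_status
```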
+ | |||
+ | ===== VALET Package Definition ===== | ||
+ | |||
+ | Assuming the workgroup does //not// already have a TensorFlow VALET package definition, the following text: | ||
+ | |||
+ | <file tensorflow.vpkg_yaml> | ||
+ | tensorflow: | ||
+ | prefix: / | ||
+ | description: | ||
+ | flags: | ||
+ | - no-standard-paths | ||
+ | actions: | ||
+ | - action: source | ||
+ | script: | ||
+ | sh: anaconda-activate.sh | ||
+ | order: failure-first | ||
+ | success: 0 | ||
+ | versions: | ||
+ | " | ||
+ | description: | ||
+ | dependencies: | ||
+ | - intel-python/ | ||
+ | " | ||
+ | description: | ||
+ | dependencies: | ||
+ | - intel-python/ | ||
+ | </ | ||
+ | |||
+ | would be added to '' | ||
+ | |||
+ | <file tensorflow.vpkg_yaml> | ||
+ | tensorflow: | ||
+ | prefix: / | ||
+ | description: | ||
+ | flags: | ||
+ | - no-standard-paths | ||
+ | actions: | ||
+ | - action: source | ||
+ | script: | ||
+ | sh: anaconda-activate.sh | ||
+ | order: failure-first | ||
+ | success: 0 | ||
+ | versions: | ||
+ | " | ||
+ | description: | ||
+ | dependencies: | ||
+ | - intel-python/ | ||
+ | " | ||
+ | description: | ||
+ | dependencies: | ||
+ | - intel-python/ | ||
+ | " | ||
+ | description: | ||
+ | dependencies: | ||
+ | - intel-python/ | ||
+ | </ | ||
+ | |||
+ | <note warning> | ||
+ | |||
+ | <note tip>On Caviness after a user has used the '' | ||
+ | |||
+ | With a properly constructed package definition file, you can now check for your versions of TensorFlow: | ||
+ | |||
+ | <code bash> | ||
+ | [(my_workgroup: |
+ | |||
+ | Available versions in package (* = default version): | ||
+ | |||
+ | [/ | ||
+ | tensorflow | ||
+ | * 2.2.0: | ||
+ | 2.3.0: | ||
+ | | ||
+ | : | ||
+ | </ | ||
+ | |||
+ | ===== Job Scripts ===== | ||
+ | |||
+ | Any job script you submit that runs Python scripts using this virtualenv should include something like the following toward its end: | ||
+ | |||
+ | < | ||
+ | # | ||
+ | # Setup TensorFlow virtualenv: | ||
+ | # | ||
+ | vpkg_require tensorflow/ | ||
+ | |||
+ | # | ||
+ | # Run a Python script in that virtualenv: | ||
+ | # | ||
+ | python3 my_tf_work.py | ||
+ | rc=$? | ||
+ | |||
+ | # | ||
+ | # Do cleanup work, etc.... | ||
+ | # | ||
+ | |||
+ | # | ||
+ | # Exit with whatever exit code our Python script handed back: | ||
+ | # | ||
+ | exit $rc | ||
+ | </ | ||
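The job script fragment above saves `$?` immediately after the Python command because any later command, including cleanup, overwrites it. The same pattern can be sketched as a standalone function. `run_then_cleanup` and its scratch directory are hypothetical names used only for illustration, and `sh -c 'exit 3'` stands in for the real Python script.

```shell
#!/bin/bash
# Run a command, remember its exit status, clean up, then hand the
# status back to the caller.
run_then_cleanup() {
    local scratch rc
    scratch="$(mktemp -d)"

    # Run the payload and capture $? right away, before any other
    # command replaces it.
    "$@"
    rc=$?

    # Cleanup runs whether the payload succeeded or failed.
    rm -rf "${scratch}"

    return "${rc}"
}

status=0
run_then_cleanup sh -c 'exit 3' || status=$?
echo "payload exit status: ${status}"   # prints "payload exit status: 3"
```

Returning the payload's status, rather than the status of the last cleanup command, lets the scheduler record whether the actual work succeeded.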