1 Environments as Code

1.1 Environments:

Building a completely reproducible environment is a “fool’s errand”, but the first steps should be easy.

(Has anyone had trouble with renv and sf?)

The hardware and system layers should be in the hands of IT (see later chapters 7 and 14); the package layer should be in the hands of the data scientist.

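Packages can be in 3 places: in a repository (e.g. CRAN or PyPI), in a library on your machine, or loaded in a running session.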

Each project should have its own “pantry” (its own package library).

“Project” was highlighted in the text, and I think it is important: if you do not have a project-based workflow, it is much harder to give each project its own package environment.

A package environment should be:

isolated, so it cannot be disrupted (for example, by updating a package in another project)
able to be “captured” and “transported”
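
To make the "isolated" property concrete, here is a minimal sketch using Python's standard-library venv module (R users would reach for renv instead). The function name create_project_env and the ".venv" folder name are assumptions chosen for illustration, not something from the book. Each project gets its own environment folder, so installing or updating a package in one project cannot touch the other.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir):
    """Create an isolated virtual environment inside a project folder.

    The ".venv" name is only a common convention (an assumption here).
    """
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False keeps this sketch fast and offline;
    # a real project would usually want pip available.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Two hypothetical projects, each with its own "pantry".
        env_a = create_project_env(Path(tmp) / "project_a")
        env_b = create_project_env(Path(tmp) / "project_b")
        print(env_a != env_b)
        print((env_a / "pyvenv.cfg").exists())
```

The pyvenv.cfg file written into each folder is what marks it as a self-contained environment.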

In R: {renv} (“lighter” / “not exactly the same” options also exist: {box}, {capsule})

The author does not like Conda (good to know I am not alone!)

(spend time exploring renv/ and .gitignore)

Document environment state (see lockfile)
Collaborate / deploy: you can’t share the installed packages themselves, because their binaries can be OS- or system-specific; the right packages need to be installed on each machine (this could be a pain point).
Use a virtual environment
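
The "document environment state" step can be sketched roughly as follows. This is not how renv or any real lockfile tool works internally. It is only an illustration, assuming Python and its standard-library importlib.metadata, of what a lockfile records: the interpreter version plus the name and version of every installed package. The function names and the JSON layout are hypothetical.

```python
import json
import sys
from importlib import metadata


def capture_environment():
    """Return a lockfile-like snapshot of the current Python environment."""
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:  # skip broken installs that have no name recorded
            packages[name] = meta.get("Version")
    return {
        "python": ".".join(str(part) for part in sys.version_info[:3]),
        "packages": dict(sorted(packages.items())),
    }


def write_snapshot(path):
    """Write the snapshot to a JSON file that can be committed and shared."""
    snapshot = capture_environment()
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(snapshot, fh, indent=2)
    return snapshot
```

A collaborator would reinstall from such a file on their own machine rather than copy the installed packages, which is why only names and versions are recorded and not the binaries.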