There is quite a lot of material to cover before your workflows become efficient, reproducible, and well-structured.
Here’s a checklist you can use to audit your progress.
|At the project level|
|Implement a consistent directory structure: data/src/gen|
|Include a readme with a project description and technical instructions on how to run/build the project|
|Store any authentication credentials outside of the repository (e.g., in a JSON file), NOT in clear text in the source code|
|At the level of each stage of your pipeline|
|Create subdirectory for source code:
|Create subdirectories for generated files in
|Make all file names relative, and not absolute (i.e., never refer to C:/mydata/myproject, but only use relative paths, e.g., ../output)||☐||☐||☐||☐|
|Create directory structure from within your source code, or use .gitkeep||☐||☐||☐||☐|
|Automation and Documentation|
|Alternatively, include a readme with running instructions||☐||☐|
|Make dependencies between source code and files-to-be-built explicit, so that
|Include a function in the makefile to delete temporary files, output files, and audit files||☐||☐||☐||☐|
|Version all source code stored in
|Do not version any files in
|Want to exclude additional files (e.g., files that (unintentionally) get written to
|Have short and accessible variable names||☐||☐||☐||☐|
|Loop what can be looped||☐||☐||☐||☐|
|Break down “long” source code into subprograms/functions, or split a script into multiple smaller scripts||☐||☐||☐||☐|
|Delete what can be deleted (including unnecessary comments, legacy calls to packages/libraries, variables)||☐||☐||☐||☐|
|Use asserts (i.e., make your program crash if it encounters an error that would otherwise go unnoticed)||☐||☐||☐||☐|
|Testing for portability|
|Tested on own computer (entirely wipe
|Tested on own computer (first clone to new directory, then re-build the entire project using
|Tested on different computer (Windows)||☐||☐||☐||☐|
|Tested on different computer (Mac)||☐||☐||☐||☐|
|Tested on different computer (Linux)||☐||☐||☐||☐|
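A few of the checklist items above can be illustrated in code. The following is a minimal Python sketch, not a prescribed implementation: the subdirectory names (`temp`, `output`, `audit`), the function names, and the credential key passed in by the caller are all assumptions made for illustration. It shows creating the directory structure from within source code, reading credentials from a JSON file kept outside the repository, using an assert to crash early, and wiping generated files.

```python
import json
import shutil
from pathlib import Path

# Assumed names for generated-file subdirectories (hypothetical, adapt to your project).
GENERATED_SUBDIRS = ["temp", "output", "audit"]


def ensure_directories(base: Path, names: list[str]) -> list[Path]:
    """Create each subdirectory under `base` if it does not exist yet."""
    created = []
    for name in names:
        path = base / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def load_credentials(path: Path, required_keys: list[str]) -> dict:
    """Read credentials from a JSON file that lives outside the repository."""
    with open(path, encoding="utf-8") as handle:
        credentials = json.load(handle)
    # Crash early instead of continuing with incomplete credentials.
    missing = [key for key in required_keys if key not in credentials]
    assert not missing, f"Missing credential keys in {path}: {missing}"
    return credentials


def wipe_directories(paths: list[Path]) -> None:
    """Delete generated directories so the project can be rebuilt from scratch."""
    for path in paths:
        shutil.rmtree(path, ignore_errors=True)
```

A script inside a stage's source directory could then call `ensure_directories(Path(".."), GENERATED_SUBDIRS)`, which keeps every path relative to the script's working directory rather than tied to one machine.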
Versioned any sensitive data?
Before making a GitHub repository public, we recommend you check that you have not stored any sensitive information in it (such as passwords). This tool has worked well for us: GitHub credentials scanner.
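To give a rough idea of what such a check involves, here is a deliberately naive Python sketch. It is not the scanner mentioned above and is no substitute for it: the pattern list and the function name `find_suspect_lines` are assumptions, and it only inspects the text you hand it, not the Git history.

```python
import re

# Assumed keyword list; dedicated scanners use far larger rule sets.
SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|api[_-]?key|token)\s*[:=]\s*['\"]?([^\s'\"]+)",
    re.IGNORECASE,
)


def find_suspect_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for lines that look like hard-coded secrets."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits
```

Running this over the contents of each tracked file gives a first impression, but credentials that were committed once and deleted later remain in the repository history, so a tool that scans the history is still needed before going public.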